Foreword



By 

Phil Hester 

Vice President

IBM RISC System/6000 Division — Systems and Technology 



IBM first introduced the POWER architecture with the RISC System/ 
6000 in early 1990. POWER, an acronym for "Performance Optimiza- 
tion With Enhanced RISC," was one of the first superscalar RISC micro- 
processors. The original microprocessor for these products consisted of a 
multi-chip implementation which set new performance and price/perfor- 
mance standards at the time of announcement. However, it soon became 
apparent that a single-chip version would be needed in order to include 
future lower cost members in the RISC System/6000 family. Systems 
would be needed that could span the range from personal computers 
through high end computers. As a result, work was started on a single- 
chip version of the POWER architecture. In early 1991 it became evident 
that this single-chip design could potentially become a high volume stan- 
dard in the industry. Accomplishing this objective would clearly require 
the development of a superior family of single-chip microprocessors, and 
the ability to supply these at competitive prices. IBM then began discus- 
sions with Motorola concerning potential collaboration to develop this 
family of microprocessors. As these discussions progressed, it became 
clear that both our general microprocessor requirements and our belief
that the high performance and low cost of these microprocessors could be
exploited in systems were shared by another of Motorola's large customers,
Apple. This led to discussions among Apple, IBM, and Motorola, in
which it became clear that we did share common objectives for the micro- 
processors. Based on this, we could define complementary roles for each 
of the companies in the development of these microprocessors as well as 
the object-oriented and multi-media software technology needed to 
exploit them. 

Through both a commitment by all three companies to make this alliance
a success and the dedication of extremely talented people from each
company, negotiations were completed and PowerPC ("PC" stands for 
"Performance Computing") was born in October of 1991. With Apple's 
personal computer systems experience, Motorola's high volume micro- 
processor and silicon knowledge, and IBM's RISC expertise and silicon 
technology capability, the components were in place to begin develop- 
ment of an entire family of RISC microprocessors which support a com- 
mon architecture and can span a wide range of computing requirements. 
Within six months a new development site, now known as the Somerset 
Design Center, was dedicated, and staffing was well underway for the 
design teams. 

Time to market for the first PowerPC microprocessor was viewed as a 
critical factor in the future success of PowerPC. As a result, an aggressive 
plan was put together to allow development of the first PowerPC micro- 
processor (PowerPC 601) as quickly as possible. It was also important 
that this microprocessor not sacrifice competitiveness as a result of its 
aggressive schedule. To achieve these goals, we combined the work 
already going on in IBM developing a single-chip version of the POWER 
architecture with the 88110 work going on in Motorola. This allowed 
us to develop a microprocessor very quickly that was compliant with 
the PowerPC architecture and utilized the expertise of both IBM and 
Motorola. 

Additional microprocessors were planned as part of the initial effort to 
be tailored to the various segments of the computing marketplace. The 
603 microprocessor is intended for very low end and battery powered 
products. Satisfying mid-range system needs, primarily optimizing price/ 
performance and permitting symmetric multiprocessor (SMP) scalability, 
is the major goal for the 604. The 620 is optimized for high end scientific 
and commercial environments, in which raw performance is critical. This 
microprocessor, with a goal of achieving the best possible performance 
from a single-chip technology, supports SMP and implements the 64-bit 
PowerPC architecture. In 32-bit mode, the 620 is fully compatible with 
all the 32-bit members of the PowerPC family (601, 603, and 604). 

As of November 1993 we are well on our way to delivering this family 
of PowerPC microprocessors. In September 1993 IBM announced the 
first product using the PowerPC 601— the RISC System/6000 Model 25. 
In October 1993 IBM and Motorola announced functional first silicon 
for the 603. It is important to note that the 601 and 603 were both com- 
pleted on schedule while also achieving their performance, functionality, 
and die size objectives. This success gives us high confidence that we will 
achieve similar results for the 604 and 620. In addition, by having fabricated
the 603 separately in both IBM and Motorola manufacturing facilities,
we have validated the compatibility of our manufacturing processes.
Since these same processes will be used for the 604 and 620, we feel con- 
fident of our ability to produce these microprocessors. 

Over time, the Somerset Design Center will continue to broaden and 
improve the product line to exploit both the PowerPC architecture and 
advances in silicon technology as they become available. Our vision of 
providing one architecture, implemented in a family of microprocessors 
each of which is optimized for a particular environment ranging from 
"palmtops to teraFLOPS," is well on the way to becoming a reality. 
Preamble

Some of us wanted to send this book to every household in America, or 
even in the world, so that everyone who might potentially be interested in 
PowerPC would easily be able to learn about it. But saner heads pre- 
vailed, realizing that some few households might not welcome such a gift, 
so the book is offered in the normal manner. 

About This Book 

The PowerPC Architecture supports a family of processors that spans a 
wide range of system and application environments. It also provides a 
stable base for software, allowing applications that run on one PowerPC 
processor to run consistently on any other PowerPC processor and well- 
designed operating systems to be moved from one processor implementa- 
tion to another by making a few minor changes. These desirable yet 
seemingly conflicting attributes are achieved by structuring the architecture
specification into three Books, and relegating all implementation-specific
aspects of the architecture to a fourth Book that is unique for
each implementation. The first three Books correspond to three levels of 
the architecture, as follows. 

■ Book I, User Instruction Set Architecture 

This Book describes the registers, instructions, storage model, and exe- 
cution model that are available to all application programs. 

■ Book II, Virtual Environment Architecture

This Book describes features of the architecture that permit application
programs to create or modify code, to share data among programs
in a multiprocessing system, and to optimize the performance of
storage accesses. 

■ Book III, Operating Environment Architecture

This Book describes features of the architecture that permit operating 
systems to allocate and manage storage, to handle errors encountered 
by application programs, to support I/O devices, and to provide the 
other services expected of secure, modern, multiprocessor operating 
systems. 

This volume consists of Books I, II, and III. The fourth Book, called 
Book IV, PowerPC Implementation Features, differs for each implementation
and is not included herein.

An important attribute of Books I, II, and III is that they do not constrain
implementations on matters that would not affect software compatibility.
For example, the effects of executing an invalidly coded
instruction are not defined and can differ between implementations. 
Compilers and assemblers are responsible for generating only correctly 
coded instructions. 

An even more important attribute of these three Books is that they 
specify the architecture in a manner that is independent of implementation.
For example, Book III specifies the rules by which storage addresses
are translated from the "effective addresses" generated by a program to 
the "real addresses" that are used to access storage, including the format 
of related tables and registers that are needed by the operating system. 
However, it does not specify how a processor should accomplish the 
translation. Thus, it permits translation lookaside buffers (TLBs) to be 
used, but does not require that they be used and does not specify their 
organization or contents. Book IV for each processor specifies all such 
implementation details, and use of the related facilities can be isolated to 
small portions of the operating system. 

All PowerPC processors conform to Book I. The PowerPC processors 
being developed jointly by Motorola and IBM for the general computer 
market conform to Books II and III as well. Other implementations may 
support only a subset of the features described in these two Books. (In 
effect, such an implementation would have its own private Book II or III.) 
For example, a processor used as an embedded controller might conform 
to Books I and II but implement a simpler storage model than the one 
described in Book III. 

Because the features described in Book III are available only to "privi- 
leged" programs such as operating systems, application binary compati- 
bility is assured even for processors that implement a different Book III. 
The ability to change portions of Book III in the future may prove very 
useful, to support new developments in operating system and hardware
technology and unforeseen processor requirements. 

As used in this volume, the term "PowerPC Architecture" refers gener- 
ically to the instructions and facilities described in Books I, II, and III. 
However, it is important to remember that these Books define three dis- 
tinct levels of the architecture, and thus three different levels of compati- 
bility. 
Introduction



1.1 Overview 

This chapter describes computation modes, compatibility with the
POWER Architecture, document conventions, a processor overview, 
instruction formats, storage addressing, and instruction fetching. 



1.2 Computation Modes

The PowerPC Architecture allows for the following types of implementa- 
tion: 

■ 64-bit implementations, in which all registers except some Special Pur- 
pose Registers are 64 bits long and effective addresses are 64 bits long. 
All 64-bit implementations have two modes of operation: 64-bit mode 
and 32-bit mode. The mode controls how the effective address is in- 
terpreted, how status bits are set, and how the Count Register is tested 
by Branch Conditional instructions. All instructions provided for 64- 
bit implementations are available in both modes. 

■ 32-bit implementations, in which all registers except Floating-Point 
Registers are 32 bits long and effective addresses are 32 bits long. 

Instructions defined in this document are provided in both 64-bit 
implementations and 32-bit implementations unless otherwise stated. 
Instructions that are provided only for 64-bit implementations are illegal 
in 32-bit implementations, and vice versa. 
1.2.1 64-bit Implementations



In both 64-bit mode and 32-bit mode of a 64-bit implementation, in- 
structions that set a 64-bit register affect all 64 bits, and the value placed 
into the register is independent of mode. In both modes, effective address 
computations use all 64 bits of the relevant registers (General Purpose 
Registers, Link Register, Count Register, etc.) and produce a 64-bit re- 
sult. However, in 32-bit mode, the high-order 32 bits of the computed ef- 
fective address are ignored when accessing data and are set to 0 when 
fetching instructions. 
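
The mode dependence described above can be sketched informally in Python. This is only an illustration under stated assumptions: the function names compute_ea and storage_address are hypothetical, registers are modeled as plain unsigned integers, and the sketch does not distinguish "ignored" from "set to 0" beyond the address finally used.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def compute_ea(base: int, displacement: int) -> int:
    """Effective address computation: uses all 64 bits in both modes."""
    return (base + displacement) & MASK64


def storage_address(ea: int, mode64: bool) -> int:
    """Address actually used for an access.

    In 32-bit mode only the low-order 32 bits of the computed
    effective address take part in the access.
    """
    return ea if mode64 else ea & MASK32


# The 64-bit sum carries into the high-order half, but 32-bit mode
# uses only the low-order 32 bits of the result.
ea = compute_ea(0x0000_0001_FFFF_FFF0, 0x20)
addr_64bit_mode = storage_address(ea, mode64=True)
addr_32bit_mode = storage_address(ea, mode64=False)
```

The point of the sketch is that the register contents and the computed sum are the same in both modes; only the interpretation of the result differs.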



1.2.2 32-bit Implementations

For a 32-bit implementation, all references to 64-bit mode in this document
should be disregarded. The semantics of instructions are as shown
in this document for 32-bit mode in a 64-bit implementation, except that
in a 32-bit implementation all registers except Floating-Point Registers
are 32 bits long. Bit numbers for registers are shown in braces ({ }) when
they differ from the corresponding numbers for a 64-bit implementation,
as described in Section 1.5.1, "Definitions and Notation," on page 5.



1.3 Instruction Mnemonics and Operands

The description of each instruction includes the mnemonic and a formatted
list of operands. Some examples are the following.

stw     RS,D(RA)
addis   RT,RA,SI

PowerPC-compliant assemblers will support the mnemonics and operand
lists exactly as shown. They will also provide certain extended mnemonics,
as described in Appendix C, "Assembler Extended Mnemonics,"
on page 215.



1.4 Compatibility with the POWER Architecture

The PowerPC Architecture provides binary compatibility for POWER application
programs, except as described in Appendix G, "Incompatibilities
with the POWER Architecture," on page 271.

Many of the PowerPC instructions are identical to POWER instructions.
For some of these the PowerPC instruction name and/or mnemonic
differs from that in POWER. To assist readers familiar with the POWER
Architecture, POWER mnemonics are shown with the individual instruction
descriptions when they differ from the PowerPC mnemonics. Also,
Appendix F, "Cross-Reference for Changed POWER Mnemonics," on
page 267, provides a cross-reference from POWER mnemonics to
PowerPC mnemonics for the instructions in this document.

1.5 Document Conventions 
1.5.1 Definitions and Notation 

The following definitions and notation are used throughout the PowerPC 
Architecture documents. 

■ A program is a sequence of related instructions. 

■ Quadwords are 128 bits, doublewords are 64 bits, words are 32 bits, 

halfwords are 16 bits, and bytes are 8 bits. 

■ All numbers are decimal unless specified in some special way. 

— 0bnnnn means a number expressed in binary format.

— 0xnnnn means a number expressed in hexadecimal format.
Underscores may be used between digits.

■ RT, RA, R1, ... refer to General Purpose Registers.

■ FRT, FRA, FR1, ... refer to Floating-Point Registers.

■ (x) means the contents of register x, where x is the name of an instruc- 
tion field. For example, (RA) means the contents of register RA, and 
(FRA) means the contents of register FRA, where RA and FRA are in- 
struction fields. Names such as LR and CTR denote registers, not 
fields, so parentheses are not used with them. Parentheses are also 
omitted when register x is the register into which the result of an oper- 
ation is placed. 

■ (RA|0) means the contents of register RA if the RA field has the value
1-31, or the value 0 if the RA field is 0. 
■ Bits in registers, instructions, and fields are specified as follows.

— Bits are numbered left to right, starting with bit 0. 

— Ranges of bits are specified by two numbers separated by a colon 
(:). The range p:q consists of bits p through q. 

— For registers that are 64 bits long in 64-bit implementations and 32 
bits long in 32-bit implementations, bit numbers and ranges are 
specified with the values for 32-bit implementations enclosed in 
braces ({ }). {} means a bit that does not exist in 32-bit implementa- 
tions. {:} means a range that does not exist in 32-bit implementa- 
tions. 

■ X_p means bit p of register/field X.

X_p{r} means bit p of register/field X in a 64-bit implementation, and bit
r of register/field X in a 32-bit implementation.

■ X_p:q means bits p through q of register/field X.

X_p:q{r:s} means bits p through q of register/field X in a 64-bit implementation,
and bits r through s of register/field X in a 32-bit implementation.

■ X_p q ... means bits p, q, ... of register/field X.

X_p q ...{r s ...} means bits p, q, ... of register/field X in a 64-bit implementation,
and bits r, s, ... of register/field X in a 32-bit implementation.

■ ¬(RA) means the one's complement of the contents of register RA.

■ Field i refers to bits 4×i through 4×i+3 of a register.

■ A period (.) as the last character of an instruction mnemonic means 
that the instruction records status information in certain fields of the 
Condition Register as a side effect of execution, as described in Chap- 
ter 2 through Chapter 4. 

■ The symbol || is used to describe the concatenation of two values. For
example, 010 || 111 is the same as 010111.

■ xⁿ means x raised to the nth power.

■ ⁿx means the replication of x, n times (i.e., x concatenated to itself n-1
times). ⁿ0 and ⁿ1 are special cases:

— ⁿ0 means a field of n bits with each bit equal to 0. Thus ⁵0 is equivalent
to 0b00000.

— ⁿ1 means a field of n bits with each bit equal to 1. Thus ⁵1 is equivalent
to 0b11111.

■ Positive means greater than zero. 

■ Negative means less than zero. 

■ A system library program is a component of the system software that 
can be called by an application program using a Branch instruction. 

■ A system service program is a component of the system software that 
can be called by an application program using a System Call instruc- 
tion. 

■ The system trap handler is a component of the system software that 
receives control when the conditions specified in a Trap instruction are 
satisfied. 

■ The system error handler is a component of the system software that 
receives control when an error occurs. The system error handler in- 
cludes a component for each of the various kinds of error. These er- 
ror-specific components are referred to as the system alignment error 
handler, the system data storage error handler, etc. 

■ Each bit and field in instructions, and in status and control registers 
(XER and FPSCR) and Special Purpose Registers, is either defined or 
reserved. 

■ /, //, ///, . . . denotes a reserved field in an instruction. 

■ Latency refers to the interval from the time an instruction begins exe- 
cution until it produces a result that is available for use by a subse- 
quent instruction. 

■ Unavailable refers to a resource that cannot be used by the program. 
Data or instruction storage is unavailable if an instruction is denied ac- 
cess to it. Floating-point instructions are unavailable if use of them is 
denied. See Book III, PowerPC Operating Environment Architecture.

■ The results of executing a given instruction are said to be boundedly 
undefined if they could have been achieved by executing an arbitrary 
sequence of instructions, starting in the state the machine was in be- 
fore executing the given instruction. Boundedly undefined results for a 
given instruction may vary between implementations, and between 
different executions on the same implementation, and are not further 
defined in this document. 
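
As an informal illustration of the bit-numbering, field, concatenation, replication, and (RA|0) notation defined above, the following Python sketch models register and field values as unsigned integers with an explicit width. The helper names (bit, bits, field, concat, replicate, ra_or_zero) are assumptions introduced for this sketch only; they are not part of the architecture.

```python
def bit(x: int, p: int, width: int) -> int:
    """Bit p of a width-bit value; bit 0 is the leftmost (most significant)."""
    return (x >> (width - 1 - p)) & 1


def bits(x: int, p: int, q: int, width: int) -> int:
    """Bits p through q (the range p:q) of a width-bit value."""
    return (x >> (width - 1 - q)) & ((1 << (q - p + 1)) - 1)


def field(x: int, i: int, width: int) -> int:
    """Field i: bits 4*i through 4*i+3 of a register."""
    return bits(x, 4 * i, 4 * i + 3, width)


def concat(a: int, b: int, b_len: int) -> int:
    """a || b, where b is b_len bits long."""
    return (a << b_len) | b


def replicate(x: int, x_len: int, n: int) -> int:
    """x (x_len bits long) concatenated with itself to give n copies."""
    result = 0
    for _ in range(n):
        result = concat(result, x, x_len)
    return result


def ra_or_zero(gprs: list, ra: int) -> int:
    """(RA|0): contents of GPR RA if the RA field is 1-31, or 0 if it is 0."""
    return 0 if ra == 0 else gprs[ra]
```

Because bits are numbered from the left, every extraction needs the width of the value: bit 0 of a 32-bit register and bit 0 of a 64-bit register sit at different integer shift positions.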
Programming Note

It is the responsibility of software to preserve bits that are now reserved in
status and control registers and in Special Purpose Registers (and Segment
Registers: see Book III, PowerPC Operating Environment Architecture), as they
may be assigned a meaning in some future version of the architecture.

In order to accomplish this preservation in implementation-independent
fashion, software should do the following.

■ Initialize each such register supplying zeros for all reserved bits.

■ Alter (defined) bit(s) in the register by reading the register, altering
only the desired bit(s), and then writing the new value back to the register.

The XER and FPSCR are partial exceptions to this recommendation. Software
can alter the status bits in these registers, preserving the reserved bits, by
executing instructions that have the side effect of altering the status bits.
Similarly, software can alter any defined bit in the FPSCR by executing a
Floating-Point Status and Control Register instruction. Using such
instructions is likely to yield better performance than using the method
described in the second item above.

When a currently reserved bit is subsequently assigned a meaning, every effort
will be made to have the value to which the system initializes the bit
correspond to the "old behavior."



1.5.2 Reserved Fields

All reserved fields in instructions should be zero. If they are not, the instruction
form is invalid: see Section 1.9.2, "Invalid Instruction Forms,"
on page 25.

The handling of reserved bits in status and control registers (XER and
FPSCR) and in Special Purpose Registers (and Segment Registers: see
Book III, Section 4.5, "Segmented Address Translation, 32-Bit Implementations,"
on page 412) is implementation dependent. For each such
reserved bit, an implementation shall either:

■ ignore the source value for the bit on write, and return zero for it on
read; or

■ set the bit from the source value on write, and return the value last set
for it on read.



1.5.3 Description of Instruction Operation

A formal description is given of the operation of each instruction. In addition,
the operation of most instructions is described by a semiformal
language at the register transfer level (RTL). This RTL uses the notation
given below, in addition to the definitions and notation described in Section
1.5.1, "Definitions and Notation," on page 5. Some of this notation
is also used in the formal descriptions of instructions. RTL notation not
summarized here should be self-explanatory.

The RTL descriptions cover the normal execution of the instruction,
except that "standard" setting of the Condition Register, Fixed-Point
Exception Register, and Floating-Point Status and Control Register are
not shown. ("Non-standard" setting of these registers, such as the setting
of Condition Register Field 0 by the stwcx. instruction, is shown.) The
RTL descriptions do not cover cases in which the system error handler is
invoked, or for which the results are boundedly undefined.

The RTL descriptions specify the architectural transformation performed
by the execution of an instruction. They do not imply any particular
implementation.

Notation                Meaning

←                       Assignment

←iea                    Assignment of an instruction effective address. In 32-bit
                        mode of a 64-bit implementation, the high-order 32 bits of
                        the 64-bit target are set to 0.

¬                       NOT logical operator

+                       Two's complement addition

-                       Two's complement subtraction, unary minus

×                       Multiplication

÷                       Division (yielding quotient)

√                       Square root

=, ≠                    Equals, Not Equals relations

<, ≤, >, ≥              Signed comparison relations

<u, >u                  Unsigned comparison relations

?                       Unordered comparison relation

&, |                    AND, OR logical operators

⊕, ≡                    Exclusive OR, Equivalence logical operators
                        ((a≡b) = (a⊕¬b))

ABS(x)                  Absolute value of x

CEIL(x)                 Least integer ≥ x

DOUBLE(x)               Result of converting x from floating-point single format to
                        floating-point double format, using the model shown on
                        page 168

EXTS(x)                 Result of extending x on the left with sign bits

GPR(x)                  General Purpose Register x

MASK(x, y)              Mask having 1s in positions x through y (wrapping if
                        x > y) and 0s elsewhere

MEM(x, y)               Contents of y bytes of memory starting at address x. In
                        32-bit mode of a 64-bit implementation, the high-order 32
                        bits of the 64-bit value x are ignored.

ROTL64(x, y)            Result of rotating the 64-bit value x left y positions

ROTL32(x, y)            Result of rotating the 64-bit value x||x left y positions,
                        where x is 32 bits long

SINGLE(x)               Result of converting x from floating-point double format
                        to floating-point single format, using the model shown on
                        page 173

SPREG(x)                Special Purpose Register x

TRAP                    Invoke the system trap handler

characterization        Reference to the setting of status bits, in a standard way
                        that is explained in the text

undefined               An undefined value. The value may vary between
                        implementations, and between different executions on the
                        same implementation.

CIA                     Current Instruction Address, which is the 64{32}-bit
                        address of the instruction being described by a sequence of
                        RTL. Used by relative branches to set the Next Instruction
                        Address (NIA) and by Branch instructions with LK=1 to
                        set the Link Register. In 32-bit mode of 64-bit
                        implementations, the high-order 32 bits of CIA are always
                        set to 0. Does not correspond to any architected register.

NIA                     Next Instruction Address, which is the 64{32}-bit address
                        of the next instruction to be executed. For a successful
                        branch, the next instruction address is the branch target
                        address: in RTL, this is indicated by assigning a value to
                        NIA. For other instructions that cause nonsequential
                        instruction fetching (see Book III, Section 2.3.1, "System
                        Linkage Instructions," on page 378), the RTL is similar.
                        For instructions that do not branch, and do not otherwise
                        cause instruction fetching to be nonsequential, the next
                        instruction address is CIA+4. In 32-bit mode of 64-bit
                        implementations, the high-order 32 bits of NIA are always
                        set to 0. Does not correspond to any architected register.

if ... then ... else ...  Conditional execution, indenting shows range; else
                        is optional

do                      Do loop, indenting shows range. "To" and/or "by"
                        clauses specify incrementing an iteration variable, and a
                        "while" clause gives termination conditions.

leave                   Leave innermost do loop, or do loop described in leave
                        statement
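
To make a few of the RTL functions in the notation list more concrete, the following Python sketch gives one possible reading of EXTS, MASK, ROTL64, and ROTL32. It is an illustration under assumptions, not a definition: values are modeled as unsigned 64-bit integers, bit 0 is taken as the leftmost bit as in Section 1.5.1, and the explicit width parameter of exts is an addition needed by the sketch (the RTL infers the operand length from context).

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def exts(x: int, width: int) -> int:
    """EXTS(x): extend a width-bit value on the left with sign bits to 64 bits."""
    sign = 1 << (width - 1)
    x &= (1 << width) - 1
    return ((x ^ sign) - sign) & MASK64


def mask(x: int, y: int) -> int:
    """MASK(x, y): 1s in positions x through y (wrapping if x > y), 0s elsewhere."""
    from_x = MASK64 >> x                        # 1s in positions x..63
    through_y = MASK64 & ~(MASK64 >> (y + 1))   # 1s in positions 0..y
    return (from_x & through_y) if x <= y else (from_x | through_y)


def rotl64(x: int, y: int) -> int:
    """ROTL64(x, y): rotate the 64-bit value x left y positions."""
    y %= 64
    x &= MASK64
    return ((x << y) | (x >> (64 - y))) & MASK64


def rotl32(x: int, y: int) -> int:
    """ROTL32(x, y): rotate the 64-bit value x||x left y positions (x is 32 bits)."""
    x &= MASK32
    return rotl64((x << 32) | x, y)
```

Note that under this reading rotl32 returns a 64-bit result in which both halves hold the rotated 32-bit value, which follows directly from rotating x||x.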

The precedence rules for RTL operators are summarized in Table 1 on 
page 11. Operators higher in the table are applied before those lower in 
the table. Operators at the same level in the table associate from left to 
right, from right to left, or not at all, as shown. (For example, - associ- 
ates from left to right, so a-b-c = (a-b)-c.) Parentheses are used to over- 
ride the evaluation order implied by the table or to increase clarity: 
parenthesized expressions are evaluated before serving as operands. 
Operators                                   Associativity

subscript, function evaluation              left to right
pre-superscript (replication),
post-superscript (exponentiation)           right to left
unary -, ¬                                  right to left
×, ÷                                        left to right
+, -                                        left to right
||                                          left to right
=, ≠, <, ≤, >, ≥, <u, >u, ?                 left to right
&, ⊕, ≡                                     left to right
|                                           left to right
: (range)                                   none
←                                           none

Table 1. Operator precedence



1.6 Processor Overview 

The processor implements the instruction set, the storage model, and 
other facilities defined in Book I. Instructions that the processor can exe- 
cute fall into three classes: 

■ branch instructions, 

■ fixed-point instructions, and 

■ floating-point instructions. 

Branch instructions are described in Section 2.4, "Branch Processor 
Instructions," on page 35. Fixed-point instructions are described in Sec- 
tion 3.3, "Fixed-Point Processor Instructions," on page 49. Floating- 
point instructions are described in Section 4.6, "Floating-Point Processor 
Instructions," on page 167. 

Fixed-point instructions operate on byte, halfword, word, and, in 64-bit
implementations, doubleword operands. Floating-point instructions
operate on single-precision and double-precision floating-point operands.
The PowerPC Architecture uses instructions that are four bytes long and
word-aligned. It provides for byte, halfword, word, and, in 64-bit implementations,
doubleword operand fetches and stores between storage and
a set of 32 General Purpose Registers (GPRs). It also provides for word
and doubleword operand fetches and stores between storage and a set of
32 Floating-Point Registers (FPRs).

Figure 1. Logical processing model

Signed integers are represented in two's complement form. 

There are no computational instructions that modify storage. To use a 
storage operand in a computation and then modify the same or another 
storage location, the content of storage must be loaded into a register, 
modified, and then stored back to the target location. Figure 1 is a logical 
representation of instruction processing. Figure 2 on page 13 shows the 
registers of the PowerPC User Instruction Set Architecture. 



Figure 2. PowerPC user register set



1.7 Instruction Formats

All instructions are four bytes long and word-aligned. Thus, whenever instruction
addresses are presented to the processor (as in Branch instructions)
the two low-order bits are ignored. Similarly, whenever the
processor develops an instruction address, its two low-order bits are zero.

Bits 0:5 always specify the opcode (OPCD, below). Many instructions 
also have an extended opcode (XO, below). The remaining bits of the 
instruction contain one or more fields as shown below for the different 
instruction formats. 
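
As a small illustration of the two rules above (the opcode occupies bits 0:5, and the two low-order bits of an instruction address are ignored), here is a hedged Python sketch. The function names are hypothetical, and the sample word 0x38600001 is an assumption chosen for the example: it is intended to be an addi encoding, whose primary opcode is 14.

```python
def opcd(instruction: int) -> int:
    """Bits 0:5 of a 32-bit instruction word (bit 0 is the leftmost bit)."""
    return (instruction >> 26) & 0x3F


def instruction_address(address: int) -> int:
    """Instruction address with the two low-order bits ignored (word-aligned)."""
    return address & ~0x3


# Sample word assumed to encode an addi instruction (primary opcode 14).
sample_word = 0x38600001
sample_opcode = opcd(sample_word)

# An address presented with nonzero low-order bits is treated as word-aligned.
aligned = instruction_address(0x1003)
```

Since bits are numbered from the left, bits 0:5 of a 32-bit word are the six most significant bits, hence the shift by 26.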

The format diagrams given below show horizontally all valid combi- 
nations of instruction fields. The diagrams include instruction fields that 
are used only by instructions defined in Book II, PowerPC Virtual Envi- 
ronment Architecture, or in Book III, PowerPC Operating Environment 
Architecture. See those Books for the definitions of such fields.

In some cases an instruction field is reserved, or must contain a partic- 
ular value. If a reserved field does not have all bits set to 0, or if a field 
that must contain a particular value does not contain that value, the 
instruction form is invalid and the results are as described in 
Section 1.9.2, "Invalid Instruction Forms," on page 25. 
Split Field Notation

In some cases an instruction field occupies more than one contiguous se- 
quence of bits, or occupies one contiguous sequence of bits that are used 
in permuted order. Such a field is called a "split field." In the format dia- 
grams given below and in the individual instruction layouts, the name of 
a split field is shown in small letters, once for each of the contiguous se- 
quences. In the RTL description of an instruction having a split field, and 
in certain other places where individual bits of a split field are identified, 
the name of the field in small letters represents the concatenation of the 
sequences from left to right. In all other places, the name of the field is 
capitalized and represents the concatenation of the sequences in some or- 
der, which need not be left to right, as described for each affected instruc- 
tion. 
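
The split-field convention can be sketched as follows. The layout used here is purely hypothetical: a 6-bit field whose two pieces occupy instruction bits 16:20 and bit 30. It is chosen only to show the difference between the left-to-right concatenation (the lowercase name) and a permuted concatenation (the capitalized name, whose order is given per instruction). The function names are likewise assumptions of this sketch.

```python
def bits(word: int, p: int, q: int, width: int = 32) -> int:
    """Bits p through q of a width-bit word; bit 0 is the leftmost bit."""
    return (word >> (width - 1 - q)) & ((1 << (q - p + 1)) - 1)


def split_field_left_to_right(word: int) -> int:
    """Lowercase-name reading: pieces concatenated in instruction order."""
    first = bits(word, 16, 20)    # 5-bit piece (hypothetical position)
    second = bits(word, 30, 30)   # 1-bit piece (hypothetical position)
    return (first << 1) | second


def split_field_permuted(word: int) -> int:
    """Capitalized-name reading, assuming the 1-bit piece is the high-order bit."""
    first = bits(word, 16, 20)
    second = bits(word, 30, 30)
    return (second << 5) | first


# Word with 0b00011 in bits 16:20 and 1 in bit 30.
example_word = (0b00011 << 11) | (1 << 1)
```

The same instruction bits therefore yield two different numeric values depending on which name is used, which is why each affected instruction states the order explicitly.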
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Figure 3. I Instruction format 



B-Form 
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Figure 4. B instruction format 










SC-Form 
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Figure 5. SC instruction format 
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Figure 6. D instruction format 



DS-Form 
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Figure 7. DS instruction format (64-bit implementations only) 
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Figure 8. X instruction format 
XL-Form

OPCD    BT       BA       BB     XO     /
OPCD    BO       BI       ///    XO     LK
OPCD    BF //    BFA //   ///    XO     /
OPCD    ///      ///      ///    XO     /
0       6        11       16     21     31

Figure 9. XL instruction format



XFX-Form 
Figure 10. XFX instruction format



XFL-Form 
Figure 11. XFL instruction format

Figure 12. XS instruction format (64-bit implementations only)

XO-Form

Figure 13. XO instruction format



A-Form 
Figure 14. A instruction format



M-Form 
Figure 15. M instruction format



MD-Form 
Figure 16. MD instruction format (64-bit implementations only)

MDS-Form

Figure 17. MDS instruction format (64-bit implementations only)



1.7.1 Instruction Fields 

AA (30) 

Absolute Address bit 

0 The immediate field represents an address relative to the current in- 
struction address. For I-form branches the effective address of the 
branch target is the sum of the LI field sign-extended to 64 bits and 
the address of the branch instruction. For B-form branches the effec- 
tive address of the branch target is the sum of the BD field sign-ex- 
tended to 64 bits and the address of the branch instruction. 

1 The immediate field represents an absolute address. For I-form 
branches the effective address of the branch target is the LI field 
sign-extended to 64 bits. For B-form branches the effective address 
of the branch target is the BD field sign-extended to 64 bits. 

BA (11:15) 

Field used to specify a bit in the CR to be used as a source. 
BB (16:20) 

Field used to specify a bit in the CR to be used as a source. 
BD (16:29) 

Immediate field specifying a 14-bit signed two's complement branch 
displacement, which is concatenated on the right with 0b00 and
sign-extended to 64 bits.

BF (6:8) 

Field used to specify one of the CR fields or one of the FPSCR fields to 
be used as a target. 

BFA (11:13) 

Field used to specify one of the CR fields or one of the FPSCR fields to 
be used as a source. 

BI (11:15)

Field used to specify a bit in the CR to be used as the condition of a 
Branch Conditional instruction. 

BO (6:10) 

Field used to specify options for the Branch Conditional instructions. 
The encoding is described in Section 2.4, "Branch Processor Instruc- 
tions," on page 35. 

BT (6:10) 

Field used to specify a bit in the CR or in the FPSCR to be used as a 
target. 

D (16:31) 

Immediate field specifying a 16-bit signed two's complement integer 
which is sign-extended to 64 bits. 

DS (16:29) 

Immediate field specifying a 14-bit signed two's complement integer 
that is concatenated on the right with 0b00 and sign-extended to 64
bits. This field is defined in 64-bit implementations only. 

FLM (7:14) 

Field mask used to identify the FPSCR fields that are to be updated by 
the mtfsf instruction.

FRA (11:15) 

Field used to specify an FPR to be used as a source. 
FRB (16:20) 

Field used to specify an FPR to be used as a source. 
FRC (21:25) 

Field used to specify an FPR to be used as a source. 
FRS (6:10) 

Field used to specify an FPR to be used as a source. 
FRT (6:10) 

Field used to specify an FPR to be used as a target. 
FXM (12:19) 

Field mask used to identify the CR fields that are to be updated by the 
mtcrf instruction. 

L (10)

Field used to specify whether a Fixed-Point Compare instruction is to 
compare 64-bit numbers or 32-bit numbers. This field is defined in 64- 
bit implementations only. 

LI (6:29) 

Immediate field specifying a 24-bit signed two's complement integer 
which is concatenated on the right with 0b00 and sign-extended to 64
bits. 

LK (31)

LINK bit. 

0 Do not set the Link Register. 

1 Set the Link Register. If the instruction is a Branch instruction, the 
address of the instruction following the Branch instruction is placed 
into the Link Register. 

MB (21:25) and ME (26:30) 

Fields used in M-form instructions to specify a 64-bit mask consisting 
of 1-bits from bit MB+32 through bit ME+32 inclusive and 0-bits else- 
where, as described in Section 3.3.13, "Fixed-Point Rotate and Shift In- 
structions," on page 115. 
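
As an illustration of the M-form mask just described, the following Python sketch builds the 64-bit mask with 1-bits from bit MB+32 through bit ME+32. The function name `mform_mask` is hypothetical, and the sketch deliberately handles only the case MB ≤ ME; any other case is governed by Section 3.3.13 and is not modeled here.

```python
def mform_mask(mb, me):
    """64-bit mask with 1-bits from bit mb+32 through bit me+32 inclusive.

    Bit 0 is the most significant bit of the 64-bit value.
    Assumption of this sketch: only 0 <= mb <= me <= 31 is handled.
    """
    if not (0 <= mb <= me <= 31):
        raise ValueError("this sketch handles only 0 <= MB <= ME <= 31")
    first, last = mb + 32, me + 32
    length = last - first + 1
    return ((1 << length) - 1) << (63 - last)


low_byte = mform_mask(24, 31)   # bits 56:63
low_word = mform_mask(0, 31)    # bits 32:63
second_byte = mform_mask(16, 23)  # bits 48:55
```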

MB (21:26) 

Field used in MD-form and MDS-form instructions to specify the first 
1-bit of a 64-bit mask, as described in Section 3.3.13, "Fixed-Point Ro- 
tate and Shift Instructions," on page 115. This field is defined in 64-bit 
implementations only. 

ME (21:26) 

Field used in MD-form and MDS-form instructions to specify the last 
1-bit of a 64-bit mask, as described in Section 3.3.13, "Fixed-Point Ro- 
tate and Shift Instructions," on page 115. This field is defined in 64-bit 
implementations only. 

NB (16:20) 

Field used to specify the number of bytes to move in an immediate 
string load or store. 

OPCD (0:5) 

Primary opcode field. 

OE (21)

Used for extended arithmetic to enable setting OV and SO in the XER. 
RA (11:15) 

Field used to specify a GPR to be used as a source or as a target. 
RB (16:20) 

Field used to specify a GPR to be used as a source. 

Rc (31) 

RECORD bit 

0 Do not alter the Condition Register. 

1 Set Condition Register Field 0 or Field 1 as described in Section 
2.3.1, "Condition Register," on page 32. 

RS (6:10) 

Field used to specify a GPR to be used as a source. 
RT (6:10) 

Field used to specify a GPR to be used as a target. 

SH (16:20, or 16:20 and 30) 

Field used to specify a shift amount. Location 16:20 and 30 pertains to 
64-bit implementations only. 

SI (16:31) 

Immediate field used to specify a 16-bit signed integer. 
SPR (11:20) 

Field used to specify a Special Purpose Register for the mtspr and mfspr 
instructions. The encoding is described in Section 3.3.14, "Move to/ 
from System Register Instructions," on page 128. 

SR (12:15) 

See Book III, Section 1.5.1, "Instruction Fields," on page 370. 
TBR (11:20) 

See Book II, Section 4.1, "Time Base Instructions," on page 352. 
TO (6:10) 

Field used to specify the conditions on which to trap. The encoding is 
described in Section 3.3.11, "Fixed-Point Trap Instructions," on 
page 101. 

U (16:19)

Immediate field used as the data to be placed into a field in the FPSCR. 
UI (16:31) 

Immediate field used to specify a 16-bit unsigned integer. 

XO (21:29, 21:30, 22:30, 26:30, 27:29, 27:30, or 30:31) 

Extended opcode field. Locations 21:29, 27:29, 27:30, and 30:31 per- 
tain to 64-bit implementations only. 
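
The bit ranges in the field list number bit 0 as the most significant bit of the 32-bit instruction. A minimal Python sketch of extracting fields under that numbering follows. The helper names `field` and `sign_extend` are illustrative and not part of the architecture; the sample words are chosen only so that the D-form fields OPCD (0:5), RT (6:10), RA (11:15), and SI (16:31) have easily recognized values.

```python
def field(word, start, end):
    """Return instruction bits start..end inclusive (bit 0 = most significant)."""
    length = end - start + 1
    return (word >> (31 - end)) & ((1 << length) - 1)


def sign_extend(value, width):
    """Interpret a width-bit field as a two's complement signed integer."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


word = 0x38610008  # sample D-form word
opcd = field(word, 0, 5)                    # primary opcode
rt = field(word, 6, 10)                     # target GPR number
ra = field(word, 11, 15)                    # source GPR number
si = sign_extend(field(word, 16, 31), 16)   # 16-bit signed immediate

negative_si = sign_extend(field(0x3861FFF8, 16, 31), 16)
```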

1.8 Classes of Instructions 

An instruction falls into exactly one of the following three classes: 

■ Defined 

■ Illegal 

■ Reserved 

The class is determined by examining the opcode, and the extended 
opcode if any. If the opcode, or combination of opcode and extended 
opcode, is not that of a defined instruction or of a reserved instruction, 
the instruction is illegal. 

Some instructions are defined only for 64-bit implementations and a 
few are defined only for 32-bit implementations (see Section 1.8.2, "Ille- 
gal Instruction Class," on page 24). With the exception of these, a given 
instruction is in the same class for all implementations of the PowerPC 
Architecture. In future versions of this architecture, instructions that are 
now illegal may become defined (by being added to the architecture) or 
reserved (by being assigned to one of the special purposes described in 
Appendix J, "Reserved Instructions," on page 293). Similarly,
instructions that are now reserved may become defined.

1.8.1 Defined Instruction Class 

This class of instructions contains all the instructions defined in the 
PowerPC User Instruction Set Architecture, PowerPC Virtual Environ- 
ment Architecture, and PowerPC Operating Environment Architecture. 

Defined instructions are guaranteed to be supported in all implementa- 
tions, except as stated in the instruction descriptions. (The exceptions are 
instructions that are supported only in 64-bit implementations or only in
32-bit implementations.) 

A defined instruction can have preferred and/or invalid forms, as de- 
scribed in Section 1.9.1, "Preferred Instruction Forms," on page 25, and 
Section 1.9.2, "Invalid Instruction Forms," on page 25. 

1.8.2 Illegal Instruction Class 

This class of instructions contains the set of instructions described in 
Appendix I, "Illegal Instructions," on page 291. For 64-bit implementa- 
tions, this class includes all instructions that are defined only for 32-bit 
implementations. For 32-bit implementations, it includes all instructions 
that are defined only for 64-bit implementations. 

Excluding instructions that are defined for one type of implementation 
but not the other, illegal instructions are available for future extensions of 
the PowerPC Architecture: that is, some future version of the PowerPC 
Architecture may define any of these instructions to perform new func- 
tions. 

Any attempt to execute an illegal instruction will cause the system ille- 
gal instruction error handler to be invoked and will have no other effect. 

An instruction consisting entirely of binary 0s is guaranteed always to
be an illegal instruction. This increases the probability that an attempt to 
execute data or uninitialized storage will result in the invocation of the 
system illegal instruction error handler. 

1.8.3 Reserved Instruction Class 

This class of instructions contains the set of instructions described in 
Appendix J, "Reserved Instructions," on page 293. 

Reserved instructions are allocated to specific purposes that are out- 
side the scope of the PowerPC Architecture. 

Any attempt to execute a reserved instruction will: 

■ perform the actions described in the Book IV, PowerPC Implementa- 
tion Features for the implementation if the instruction is implemented; 
or 

■ cause the system illegal instruction error handler to be invoked if the 
instruction is not implemented. 

1.9 Forms of Defined Instructions



1.9.1 Preferred Instruction Forms 

Some of the defined instructions have preferred forms. For such an in- 
struction, the preferred form will execute in an efficient manner, but any 
other form may take significantly longer to execute than the preferred 
form. 

Instructions having preferred forms are: 

■ the Load/Store Multiple instructions 

■ the Load/Store String instructions 

■ the Or Immediate instruction (preferred form of no-op) 



1.9.2 Invalid Instruction Forms 



Some of the defined instructions have invalid forms. An instruction form 
is invalid if one or more fields of the instruction, excluding the opcode 
field(s), are coded incorrectly in a manner that can be deduced by exam- 
ining only the instruction encoding. 

Any attempt to execute an invalid form of an instruction will either 
cause the system illegal instruction error handler to be invoked or yield 
boundedly undefined results. Exceptions to this rule are stated in the 
instruction descriptions. 

Some kinds of invalid form can be deduced from the instruction
layout. These are listed below.

■ Field shown as /(s) but coded as nonzero.

■ Field shown as containing a particular value but coded as some other 
value. 

These invalid forms are not discussed further. 
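
A decoder or assembler could detect the first kind of invalid form with a check such as the following Python sketch. The names `field`, `reserved_fields_clear`, and `XL_ROW2_RESERVED` are hypothetical, and the reserved range is taken from the second XL-form layout (bits 16:20 shown as ///); the sample words are illustrative.

```python
def field(word, start, end):
    """Return instruction bits start..end inclusive (bit 0 = most significant)."""
    length = end - start + 1
    return (word >> (31 - end)) & ((1 << length) - 1)


def reserved_fields_clear(word, reserved_ranges):
    """True if every reserved (start, end) bit range in the word is all zeros."""
    return all(field(word, start, end) == 0 for start, end in reserved_ranges)


# Reserved range assumed from the second XL-form layout: bits 16:20.
XL_ROW2_RESERVED = [(16, 20)]

valid_word = 0x4E800020                      # bits 16:20 are zero
invalid_word = valid_word | (1 << (31 - 20))  # sets bit 20, a reserved bit

valid_ok = reserved_fields_clear(valid_word, XL_ROW2_RESERVED)
invalid_ok = reserved_fields_clear(invalid_word, XL_ROW2_RESERVED)
```

A word that fails this check is an invalid form, so executing it either invokes the system illegal instruction error handler or yields boundedly undefined results, as stated above.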

Instructions having invalid forms that cannot be so deduced are listed 
below. These kinds of invalid form are identified in the instruction 
descriptions. 

■ the Branch Conditional instructions

■ the Load/Store with Update instructions

■ the Load Multiple instruction

■ the Load String instructions

■ the Fixed-Point Compare instructions (invalid form exists only in
32-bit implementations)

■ the Load/Store Floating-Point with Update instructions

Assembler Note

To the extent possible,
the Assembler should
report uses of invalid
instruction forms as
errors.

1.9.3 Optional Instructions 

Some of the defined instructions are optional. The optional instructions 
are defined in Appendix A, "Optional Instructions," on page 197, and 
also in Book III, Section 4.11.3, "Lookaside Buffer Management Instruc- 
tions (Optional)," on page 442 and Book III, Appendix A, "Optional Fa- 
cilities and Instructions," on page 489. 

Any attempt to execute an optional instruction that is not provided by 
the implementation will cause the system illegal instruction error handler 
to be invoked. Exceptions to this rule are stated in the instruction descrip- 
tions. 

1.10 Exceptions 

There are two kinds of exception, those caused directly by the execution 
of an instruction and those caused by an asynchronous event. In either 
case, the exception may cause one of several components of the system 
software to be invoked. 

The exceptions that can be caused directly by the execution of an 
instruction include the following: 

■ an attempt to execute an illegal instruction, or an attempt by an appli- 
cation program to execute a "privileged" instruction [see Book III, 
Section 5.5.7, "Program Interrupt," on page 467 (system illegal in- 
struction error handler or system privileged instruction error handler)] 

■ the execution of a defined instruction using an invalid form (system il- 
legal instruction error handler or system privileged instruction error 
handler) 

■ the execution of an optional instruction that is not provided by the im- 
plementation (system illegal instruction error handler) 

■ an attempt to access a storage location that is unavailable (system er- 
ror handler) 

■ an attempt to access storage with an effective address alignment that is 
invalid for the instruction (system alignment error handler) 

■ the execution of a System Call instruction (system service program)

■ the execution of a Trap instruction that traps (system trap handler) 

■ the execution of a floating-point instruction when floating-point in- 
structions are unavailable (system floating-point unavailable error 
handler) 

■ the execution of a floating-point instruction that causes a floating- 
point exception that is enabled (system floating-point enabled excep- 
tion error handler) 

■ the execution of a floating-point instruction that requires system soft- 
ware assistance (system floating-point assist error handler; the condi- 
tions under which such software assistance is required are 
implementation-dependent) 

The exceptions that can be caused by an asynchronous event are 
described in Book III, Chapter 5, "Interrupts," on page 453. 

The invocation of the system error handler is precise, except that if 
one of the imprecise modes for invoking the system floating-point 
enabled exception error handler is in effect (see page 153) then the invo- 
cation of the system floating-point enabled exception error handler may 
be imprecise. When the system error handler is invoked imprecisely, the 
excepting instruction does not appear to complete before the next 
instruction starts (because one of the effects of the excepting instruction, 
namely the invocation of the system error handler, has not yet occurred). 

Additional information about exception handling can be found in 
Book III, Chapter 5. 

1.11 Storage Addressing 

A program references storage using the effective address computed by the 
processor when it executes a Storage Access or Branch instruction (or 
certain other instructions described in Book II, PowerPC Virtual Envi- 
ronment Architecture, and Book III, PowerPC Operating Environment 
Architecture) or when it fetches the next sequential instruction. 

1.11.1 Storage Operands 

Bytes in storage are numbered consecutively starting with 0. Each num- 
ber is the address of the corresponding byte. 

Storage operands may be bytes, halfwords, words, or doublewords, or,
for the Load/Store Multiple and Move Assist instructions, a sequence of 
bytes or words. The address of a storage operand is the address of its first 
byte (i.e., of its lowest-numbered byte). Byte ordering is Big-Endian by 
default, but PowerPC can be operated in a mode in which byte ordering 
is Little-Endian. See Appendix D, "Little-Endian Byte Ordering," on 
page 233. 

Operand length is implicit for each instruction. 

The operand of a single-register Storage Access instruction has a "nat- 
ural" alignment boundary equal to the operand length. In other words, 
the "natural" address of an operand is an integral multiple of the oper- 
and length. A storage operand is said to be "aligned" if it is aligned at its 
natural boundary; otherwise it is said to be "unaligned." 

Storage operands for single-register Storage Access instructions have 
the following characteristics. (Although not permitted as storage oper- 
ands, quadwords are shown because quadword alignment is desirable for 
certain storage operands.)



Operand       Length      Addr_60:63 if aligned

Byte          8 bits      xxxx
Halfword      2 bytes     xxx0
Word          4 bytes     xx00
Doubleword    8 bytes     x000
Quadword      16 bytes    0000



Note: An "x" in an address bit position indicates that the bit can be 0 or 1 independent of the 
state of other bits in the address. 



The concept of alignment is also applied more generally, to any datum 
in storage. For example, a 12-byte datum in storage is said to be word- 
aligned if its address is an integral multiple of 4. 
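
The alignment rule reduces to a divisibility test, sketched below in Python. The function name `is_aligned` and the sample addresses are illustrative assumptions.

```python
def is_aligned(address, boundary):
    """True if address is an integral multiple of boundary (in bytes)."""
    return address % boundary == 0


# Natural boundaries equal the operand length in bytes.
word_at_1004 = is_aligned(0x1004, 4)        # word operand, low bits xx00
doubleword_at_1004 = is_aligned(0x1004, 8)  # low bits are 0100, not x000

# The general notion: a 12-byte datum is word-aligned if its address
# is a multiple of 4, whatever its own length.
datum_word_aligned = is_aligned(0x100C, 4)
```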

Some instructions require their storage operands to have certain align- 
ments. In addition, alignment may affect performance. For single-register 
Storage Access instructions, the best performance is obtained when stor- 
age operands are aligned. Additional effects of data placement on perfor- 
mance are described in Book II, Chapter 2, "Effect of Operand Placement 
on Performance," on page 339. 

Instructions are always four bytes long and word-aligned. 

1.11.2 Effective Address Calculation

The 64- or 32-bit address computed by the processor when executing a 
Storage Access or Branch instruction (or certain other instructions
described in Book II, PowerPC Virtual Environment Architecture, and
Book III, PowerPC Operating Environment Architecture) or when fetch- 
ing the next sequential instruction is called the "effective address" and 
specifies a byte in storage. For a Storage Access instruction, if the sum of 
the effective address and the operand length exceeds the maximum effec- 
tive address, the storage operand is considered to wrap around from the 
maximum effective address to effective address 0, as described below. 

Effective address computations, for both data and instruction accesses, 
use 64{32}-bit unsigned binary arithmetic regardless of mode. A carry 
from bit 0 is ignored. In a 64-bit implementation, the 64-bit current 
instruction address and next instruction address are not affected by a 
change from 32-bit mode to 64-bit mode, but they are affected by a 
change from 64-bit mode to 32-bit mode (the high-order 32 bits are set to 
0). 

In 64-bit mode, the entire 64-bit result is used as the 64-bit effective 
address. The effective address arithmetic wraps around from the
maximum address, 2^64-1, to address 0.

In 32-bit mode, the low-order 32 bits of the 64-bit result are used as 
the effective address for the purpose of addressing storage. The high- 
order 32 bits of the 64-bit effective address are ignored for the purpose of 
accessing data, but are included whenever a 64-bit effective address is 
placed into a GPR by Load with Update and Store with Update instruc- 
tions. The high-order 32 bits of the 64-bit effective address are set to 0 
for the purpose of fetching instructions, and whenever a 64-bit effective 
address is placed into the Link Register by Branch instructions having 
LK=1. The high-order 32 bits of the 64-bit effective address are set to 0 in 
Special Purpose Registers when the system error handler is invoked. As 
used to address storage, the effective address arithmetic appears to wrap 
around from the maximum address, 2^32-1, to address 0.
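
The wraparound behavior in the two modes can be modeled as follows. This Python sketch assumes a 64-bit implementation and covers only the address used to access storage; the function name `storage_address` is hypothetical, and the special handling of the high-order 32 bits for the Link Register and other registers is not modeled.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def storage_address(base, offset, mode64):
    """Address used to access storage, for a 64-bit implementation.

    The sum is formed with 64-bit unsigned arithmetic; a carry out of
    bit 0 is ignored. In 32-bit mode only the low-order 32 bits of the
    result are used to address storage.
    """
    ea = (base + offset) & MASK64
    return ea if mode64 else ea & MASK32


wrap64 = storage_address(MASK64, 1, mode64=True)       # wraps to address 0
wrap32 = storage_address(0xFFFFFFFF, 1, mode64=False)  # appears to wrap to 0
no_wrap = storage_address(0xFFFFFFFF, 1, mode64=True)  # bit 31 carries onward
```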

A zero in the RA field indicates the absence of the corresponding 
address component. For the absent component, a value of zero is used 
for the address. This is shown in the instruction descriptions as (RA|0).

In both 64-bit and 32-bit modes, the calculated effective address may 
be modified in its three low-order bits before accessing storage if the 
PowerPC system is operating in Little-Endian mode. See Appendix D, 
"Little-Endian Byte Ordering," on page 233. 

Effective addresses are computed as follows. In the descriptions below, 
it should be understood that "the contents of a GPR" refers to the entire 
64-bit contents, independent of mode, but that in 32-bit mode only bits
32:63 of the 64-bit result of the computation are used to address storage.

■ With X-form instructions, in computing the effective address of a data 
element, the contents of the GPR designated by RB are added to the 
contents of the GPR designated by RA or to zero if RA=0. 

■ With D-form instructions, the 16-bit D field is sign-extended to form a 
64-bit address component. In computing the effective address of a 
data element, this address component is added to the contents of the 
GPR designated by RA or to zero if RA=0. 

■ With DS-form instructions, the 14-bit DS field is concatenated on the
right with 0b00 and sign-extended to form a 64-bit address component.
In computing the effective address of a data element, this address
component is added to the contents of the GPR designated by RA or to
zero if RA=0.

■ With I-form Branch instructions, the 24-bit LI field is concatenated on
the right with 0b00 and sign-extended to form a 64-bit address
component. If AA=0, this address component is added to the address of
the branch instruction to form the effective address of the next
instruction. If AA=1, this address component is the effective address
of the next instruction.

■ With B-form Branch instructions, the 14-bit BD field is concatenated
on the right with 0b00 and sign-extended to form a 64-bit address
component. If AA=0, this address component is added to the address
of the branch instruction to form the effective address of the next
instruction. If AA=1, this address component is the effective address
of the next instruction.

■ With XL-form Branch instructions, bits 0:61 of the Link Register or
the Count Register are concatenated on the right with 0b00 to form
the effective address of the next instruction.

■ With sequential instruction fetching, the value 4 is added to the ad- 
dress of the current instruction to form the effective address of the 
next instruction. 
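
Two of the cases above, the D-form data address and the I-form branch target, can be sketched in Python as follows. The function names, the list used to model the GPRs, and the sample values are assumptions made for illustration; the sketch models a 64-bit implementation in 64-bit mode.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, width):
    """Interpret a width-bit field as a two's complement signed integer."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def ea_dform(gpr, ra, d):
    """Effective address for a D-form access: (RA|0) + sign-extended D."""
    base = 0 if ra == 0 else gpr[ra]  # RA=0 means "no base", not GPR 0
    return (base + sign_extend(d, 16)) & MASK64


def target_iform(branch_address, li, aa):
    """Branch target for an I-form branch: LI || 0b00, sign-extended."""
    component = sign_extend(li << 2, 26)
    if aa:
        return component & MASK64
    return (branch_address + component) & MASK64


gpr = [0] * 32
gpr[0] = 0xDEAD   # ignored when RA=0
gpr[1] = 0x1000

ea_with_base = ea_dform(gpr, 1, 0xFFF8)       # 0x1000 + (-8)
ea_no_base = ea_dform(gpr, 0, 0x0010)         # 0 + 16
relative = target_iform(0x2000, 0xFFFFFF, 0)  # displacement of -4
absolute = target_iform(0x2000, 0x000010, 1)  # 0x10 || 0b00
```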

2.1 Branch Processor Overview

This chapter describes the registers and instructions that make up the 
Branch Processor facility. Section 2.3, "Branch Processor Registers," on
page 32 describes the registers associated with the Branch Processor. Sec- 
tion 2.4, "Branch Processor Instructions," on page 35 describes the 
instructions associated with the Branch Processor. 



2.2 Instruction Fetching 

In general, instructions appear to execute sequentially, in the order in 
which they appear in storage. The exceptions to this rule are listed 
below. 

■ Branch instructions for which the branch is taken cause execution to 
continue at the target address generated by the Branch instruction. 

■ Trap and System Call instructions cause the appropriate system han- 
dler to be invoked. 

■ Exceptions can cause the system error handler to be invoked, as 
described in Section 1.10, "Exceptions," on page 26. 

■ The Return From Interrupt instruction, described in Book III, "Return 
From Interrupt XL-form," on page 379, causes execution to continue 
at the address contained in a Special Purpose Register. 

Programming Note

If a program modifies the 
instructions it intends to 
execute, it should call 
the appropriate system 
library program before 
attempting to execute 
the modified instructions, 
to ensure that the 
modifications have taken 
effect with respect to 
instruction fetching. 



In general, from the view of the processor executing the instructions, 
each instruction appears to complete before the next instruction starts. 
For the instructions and facilities defined in Book I, the only exceptions to
this rule are the following. 

■ The system error handler is invoked imprecisely. The instruction that 
causes the system error handler to be invoked does not complete 
before the next instruction starts: see Section 1.10, "Exceptions," on 
page 26. 

■ A Store instruction modifies a storage location that contains an 
instruction. Software synchronization is required to ensure that subse- 
quent instruction fetches from that location obtain the modified ver- 
sion of the instruction: see Book II, Section 3.2.1, "Instruction Cache 
Instructions," on page 344. 



2.3 Branch Processor Registers 



2.3.1 Condition Register 

The Condition Register (CR) is a 32-bit register which reflects the result 
of certain operations, and provides a mechanism for testing (and branch- 
ing). 



CR 

Figure 18. Condition Register

The bits in the Condition Register are grouped into eight 4-bit fields,
named CR Field 0 (CR0), ..., CR Field 7 (CR7), which are set in one of
the following ways:

■ Specified fields of the CR can be set by a move to the CR from a GPR
(mtcrf).

■ A specified field of the CR can be set by a move to the CR from
another CR field (mcrf), from the XER (mcrxr), or from the FPSCR
(mcrfs).

■ CR Field 0 can be set as the implicit result of a fixed-point
instruction.

■ CR Field 1 can be set as the implicit result of a floating-point
instruction.

■ A specified CR field can be set as the result of either a fixed-point or a 
floating-point Compare instruction. 

Instructions are provided to perform logical operations on individual 
CR bits and to test individual CR bits. 

For all fixed-point instructions in which Rc=1, and for addic., andi.,
and andis., the first three bits of CR Field 0 (bits 0:2 of the Condition
Register) are set by signed comparison of the result to zero, and the 
fourth bit of CR Field 0 (bit 3 of the Condition Register) is copied from 
the SO field of the XER. "Result" here refers to the entire 64-bit value 
placed into the target register in 64-bit mode, and to bits 32:63 of the 64- 
bit value placed into the target register in 32-bit mode. 

if (64-bit implementation) & (64-bit mode)
then M <- 0
else M <- 32

if (target_register)_M:63 < 0 then c <- 0b100
else if (target_register)_M:63 > 0 then c <- 0b010
else c <- 0b001

CR0 <- c || XER_SO

If any portion of the result is undefined, then the value placed into the 
first three bits of CR Field 0 is undefined. 
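
The setting of CR Field 0 can also be expressed as a small Python sketch. The function name `cr0_value` and the representation of CR Field 0 as a 4-bit integer (bit 0 of the field as the most significant bit) are assumptions of this illustration; the undefined-result case is not modeled.

```python
def cr0_value(target_register, xer_so, mode64):
    """CR Field 0 for a fixed-point result, as a 4-bit integer (LT GT EQ SO)."""
    m = 0 if mode64 else 32
    width = 64 - m
    result = target_register & ((1 << width) - 1)  # bits M:63 of the target
    if result >> (width - 1):   # sign bit of the M:63 value is set
        c = 0b100               # negative
    elif result != 0:
        c = 0b010               # positive
    else:
        c = 0b001               # zero
    return (c << 1) | xer_so    # c || XER_SO


positive_in_64 = cr0_value(0x00000000FFFFFFFF, 0, mode64=True)
negative_in_32 = cr0_value(0x00000000FFFFFFFF, 0, mode64=False)
zero_with_so = cr0_value(0, 1, mode64=True)
```

The same 64-bit register contents thus produce different CR Field 0 values in the two modes, because only bits 32:63 are examined in 32-bit mode.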

The bits of CR Field 0 are interpreted as follows.

Bit Description

0 Negative (LT)
The result is negative.

1 Positive (GT)
The result is positive.

2 Zero (EQ)
The result is zero.

3 Summary Overflow (SO)
This is a copy of the final state of XER_SO at the completion of the
instruction.

Programming Note

CR Field 0 may not reflect
the "true" (infinitely
precise) result if
overflow occurs: see
Section 3.3.9, "Fixed-
Point Arithmetic
Instructions," on page 81.

The fixed-point instructions stwcx. and stdcx. also set CR Field 0.

For all floating-point instructions in which Rc=1, CR Field 1 (bits 4:7
of the Condition Register) is set to the Floating-Point exception status,
copied from bits 0:3 of the Floating-Point Status and Control Register.
These bits are interpreted as follows.

Bit Description

4 Floating-Point Exception Summary (FX)
This is a copy of the final state of FPSCR_FX at the completion of
the instruction.

5 Floating-Point Enabled Exception Summary (FEX)
This is a copy of the final state of FPSCR_FEX at the completion of
the instruction.

6 Floating-Point Invalid Operation Exception Summary (VX)
This is a copy of the final state of FPSCR_VX at the completion of
the instruction.

7 Floating-Point Overflow Exception (OX)
This is a copy of the final state of FPSCR_OX at the completion of
the instruction.

For Compare instructions, a specified CR field is set to reflect the 
result of the comparison. The bits of the specified CR field are inter- 
preted as follows. A complete description of how the bits are set is given 
in the instruction descriptions in Section 3.3.10, "Fixed-Point Compare 
Instructions," on page 98, and Section 4.6.7, "Floating-Point Compare 
Instructions," on page 191. 

Bit Description

0 Less Than, Floating-Point Less Than (LT, FL)
For fixed-point Compare instructions, (RA) < SI or (RB) (signed
comparison) or (RA) <u UI or (RB) (unsigned comparison). For
floating-point Compare instructions, (FRA) < (FRB).

1 Greater Than, Floating-Point Greater Than (GT, FG)
For fixed-point Compare instructions, (RA) > SI or (RB) (signed
comparison) or (RA) >u UI or (RB) (unsigned comparison). For
floating-point Compare instructions, (FRA) > (FRB).

2 Equal, Floating-Point Equal (EQ, FE)
For fixed-point Compare instructions, (RA) = SI, UI, or (RB). For
floating-point Compare instructions, (FRA) = (FRB).

3 Summary Overflow, Floating-Point Unordered (SO, FU)
For fixed-point Compare instructions, this is a copy of the final
state of XER_SO at the completion of the instruction. For
floating-point Compare instructions, one or both of (FRA) and (FRB)
is a NaN.
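
For the fixed-point case, the four bits can be computed as in the Python sketch below. The function name `compare_fixed` and the packing of the CR field into a 4-bit integer (LT as the most significant bit) are assumptions of this illustration; floating-point comparison and the selection of the target CR field are not modeled.

```python
def compare_fixed(a, b, xer_so, signed, width=64):
    """CR field (LT GT EQ SO as a 4-bit integer) for a fixed-point compare."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if signed:
        sign = 1 << (width - 1)
        a = (a ^ sign) - sign  # reinterpret as two's complement
        b = (b ^ sign) - sign
    lt = int(a < b)
    gt = int(a > b)
    eq = int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | xer_so


all_ones = 0xFFFFFFFFFFFFFFFF
signed_result = compare_fixed(all_ones, 1, 0, signed=True)     # -1 < 1
unsigned_result = compare_fixed(all_ones, 1, 0, signed=False)  # 2^64-1 > 1
equal_with_so = compare_fixed(5, 5, 1, signed=True)
```

The same operands give opposite LT and GT results under signed and unsigned comparison, which is why the two kinds of comparison are listed separately above.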

2.3.2 Link Register 

The Link Register (LR) is a 64-bit {32-bit} register. It can be used to pro- 
vide the branch target address for the Branch Conditional to Link Regis- 
ter instruction, and it holds the return address after Branch and Link 
instructions. 



LR
0                                                          63{31}

Figure 19. Link Register



2.3.3 Count Register 

The Count Register (CTR) is a 64-bit {32-bit} register. It can be used to 
hold a loop count that can be decremented during execution of Branch 
instructions that contain an appropriately coded BO field. If the value in 
the Count Register is 0 before being decremented, it is -1 afterward. The 
Count Register can also be used to provide the branch target address for 
the Branch Conditional to Count Register instruction. 
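
The decrement wraps in two's complement arithmetic, as the following Python sketch shows. The function name `decrement_ctr` is hypothetical; `width` is 64 for a 64-bit implementation and 32 for a 32-bit implementation.

```python
def decrement_ctr(ctr, width=64):
    """Decrement the Count Register, wrapping at the register width."""
    return (ctr - 1) & ((1 << width) - 1)


after_zero = decrement_ctr(0)         # all 1-bits, i.e. -1 as a signed value
after_one = decrement_ctr(1)
after_zero_32 = decrement_ctr(0, 32)  # 32-bit implementation
```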



CTR
0                                                          63{31}

Figure 20. Count Register



2.4 Branch Processor Instructions 
2.4.1 Branch instructions 

The sequence of instruction execution can be changed by the Branch 
instructions. Because all instructions are on word boundaries, bits 62
and 63 of the generated branch target address are ignored by the proces- 
sor in performing the branch. 

The Branch instructions compute the effective address (EA) of the tar-
get in one of the following four ways, as described in Section 1.11.2, 
"Effective Address Calculation," on page 29. 

1. Adding a displacement to the address of the branch instruction
(Branch or Branch Conditional with AA=0). 

2. Specifying an absolute address (Branch or Branch Conditional with 
AA=1). 

3. Using the address contained in the Link Register (Branch Conditional 
to Link Register). 

4. Using the address contained in the Count Register (Branch
Conditional to Count Register).

In all four cases, in 32-bit mode of 64-bit implementations, the final 
step in the address computation is setting the high-order 32 bits of the 
target address to 0. 
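
The four address computations above can be sketched in Python. This is an illustrative model only, not part of the architecture definition: the function name `branch_target`, its `kind` labels, and its parameters are assumptions made for the example. It shows two zero bits being appended to the displacement field before sign extension, the low-order two bits of the Link Register or Count Register being ignored, and the high-order 32 bits being cleared as the final step in 32-bit mode.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(kind, cia=0, field=0, field_bits=24, lr=0, ctr=0, mode32=False):
    """Return the branch target address for one of the four methods.

    kind       -- "relative", "absolute", "lr", or "ctr" (hypothetical labels)
    field      -- the LI (24-bit) or BD (14-bit) instruction field
    field_bits -- width of that field: 24 for LI, 14 for BD
    """
    if kind == "relative":      # AA=0: displacement added to the branch address
        ea = cia + sign_extend(field << 2, field_bits + 2)
    elif kind == "absolute":    # AA=1: the field itself gives the address
        ea = sign_extend(field << 2, field_bits + 2)
    elif kind == "lr":          # Branch Conditional to Link Register
        ea = lr & ~0b11
    elif kind == "ctr":         # Branch Conditional to Count Register
        ea = ctr & ~0b11
    else:
        raise ValueError(f"unknown kind: {kind}")
    ea &= MASK64
    if mode32:                  # final step in 32-bit mode: clear the high 32 bits
        ea &= 0xFFFFFFFF
    return ea
```

For example, a relative branch at address 0x1000 whose LI field is all ones (a displacement of -4 bytes) targets 0xFFC under this model.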

For the first two methods, the target addresses can be computed suffi- 
ciently ahead of the branch instruction that instructions can be prefetched 
along the target path. For the third and fourth methods, prefetching 
instructions along the target path is also possible, provided the Link Reg- 
ister or the Count Register is loaded sufficiently ahead of the branch 
instruction. 

Branching can be conditional or unconditional, and the return address 
can optionally be provided. If the return address is to be provided 
(LK=1), the effective address of the instruction following the branch 
instruction is placed into the Link Register after the branch target address 
has been computed. This is done whether or not the branch is taken. 

In Branch Conditional instructions, the BO field specifies the condi- 
tions under which the branch is taken. The first four bits of the BO field 
specify how the branch is affected by or affects the Condition Register 
and the Count Register. The fifth bit, shown below as having the value 
"y", may be used by some implementations as described below. 

The encoding for the BO field is as follows. Here M=0 in 64-bit mode 
and M=32 in 32-bit mode. If the BO field specifies that the CTR is to be 
decremented, the entire 64-bit CTR is decremented regardless of the 
mode. 
BO      Description

0000y   Decrement the CTR, then branch if the decremented CTR_M:63≠0
        and the condition is FALSE.
0001y   Decrement the CTR, then branch if the decremented CTR_M:63=0
        and the condition is FALSE.
001zy   Branch if the condition is FALSE.
0100y   Decrement the CTR, then branch if the decremented CTR_M:63≠0
        and the condition is TRUE.
0101y   Decrement the CTR, then branch if the decremented CTR_M:63=0
        and the condition is TRUE.
011zy   Branch if the condition is TRUE.
1z00y   Decrement the CTR, then branch if the decremented CTR_M:63≠0.
1z01y   Decrement the CTR, then branch if the decremented CTR_M:63=0.
1z1zz   Branch always.

Above, "z" denotes a bit that is ignored. 
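
As a rough illustration of the table, the following Python sketch evaluates a BO value against a Condition Register bit and the Count Register. The function name `branch_conditional` and its argument layout are assumptions for this example. The "y" and "z" bits are simply ignored here, and the tested portion of the CTR is bits M:63, so only the low-order 32 bits are examined in 32-bit mode even though the full 64-bit register is decremented.

```python
MASK64 = (1 << 64) - 1


def branch_conditional(bo, cr_bit, ctr, mode32=False):
    """Evaluate a 5-bit BO field.

    bo     -- BO field value (BO_0 is the most significant of the five bits)
    cr_bit -- value (0 or 1) of the Condition Register bit selected by BI
    ctr    -- current 64-bit Count Register value
    Returns (taken, new_ctr).
    """
    bo0, bo1, bo2, bo3 = ((bo >> shift) & 1 for shift in (4, 3, 2, 1))
    if not bo2:
        ctr = (ctr - 1) & MASK64          # the whole 64-bit CTR is decremented
    tested = ctr & 0xFFFFFFFF if mode32 else ctr      # CTR_M:63
    ctr_ok = bool(bo2) or ((tested != 0) != bool(bo3))
    cond_ok = bool(bo0) or (cr_bit == bo1)
    return ctr_ok and cond_ok, ctr
```

With BO=0b10000 and a Count Register value of 0, the sketch returns a taken branch and a new count of all ones, matching the statement that a count of 0 becomes -1 after the decrement.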

The "y" bit provides a hint about whether a conditional branch is 
likely to be taken, and may be used by some implementations to improve 
performance. 

The "branch always" encoding of the BO field does not have a "y" bit. 
For Branch Conditional instructions that have a "y" bit, using y=0 
indicates that the following behavior is likely. 

■ If the instruction is bc[l][a] with a negative value in the displacement 
field, the branch is taken. 

■ In all other cases (bc[l][a] with a nonnegative value in the
displacement field, bclr[l], or bcctr[l]), the branch falls through
(is not taken).

Using y=1 reverses the preceding indications.

The displacement field is used as described above even if the target is 
an absolute address. 
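
The hint rule can be stated compactly in code. The sketch below is a hypothetical helper (`predicted_taken` is not an architected name) that returns the prediction implied by the "y" bit, the instruction form, and the sign of the displacement field.

```python
def predicted_taken(mnemonic, y, displacement=0):
    """Return True if the hint predicts that the branch is taken.

    mnemonic     -- base form: "bc", "bclr", or "bcctr" (l/a suffixes dropped)
    y            -- the "y" bit of the BO field
    displacement -- signed value of the displacement field (used for bc only)
    """
    # With y=0, only bc with a negative displacement is predicted taken.
    default = mnemonic == "bc" and displacement < 0
    # y=1 reverses the default indication.
    return default != bool(y)
```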



Programming Note

The "z" bits should be set to 0, as they may be assigned a meaning in
some future version of the architecture.

The default value for the "y" bit should be 0: the value 1 should be
used only if software has determined that the prediction corresponding
to y=1 is more likely to be correct than the prediction corresponding
to y=0.



Extended mnemonics for branches 

Many extended mnemonics are provided so that Branch Conditional 
instructions can be coded with the condition as part of the instruction 
mnemonic rather than as a numeric operand. Some of these are shown as 
examples with the Branch instructions. See Appendix C, "Assembler 
Extended Mnemonics," on page 215 for additional extended mnemonics. 
Programming Note

In some implementations, the processor may keep a stack of the Link
Register values most recently set by Branch and Link instructions, with
the possible exception of the form shown below for obtaining the address
of the next instruction. To benefit from this stack, the following
programming conventions should be used.

Let A, B, and Glue be programs.

■ Obtaining the address of the next instruction:

Use the following form of Branch and Link.

bcl 20,31,$+4

■ Loop counts:

Keep them in the Count Register and use one of the Branch Conditional
instructions to decrement the count and to control branching (e.g.,
branching back to the start of a loop if the decremented counter value
is nonzero).

■ Computed goto's, case statements, etc.:

Use the Count Register to hold the address to branch to, and use the
bcctr instruction (LK=0) to branch to the selected address.

■ Direct subroutine linkage:


Branch I-form

b     target_addr    (AA=0 LK=0)
ba    target_addr    (AA=1 LK=0)
bl    target_addr    (AA=0 LK=1)
bla   target_addr    (AA=1 LK=1)

Field   18    LI     AA   LK
Bits    0:5   6:29   30   31

if AA then NIA ←iea EXTS(LI || 0b00)
else       NIA ←iea CIA + EXTS(LI || 0b00)
if LK then LR ←iea CIA + 4

target_addr specifies the branch target address.

If AA=0 then the branch target address is the sum of LI || 0b00 sign-
extended and the address of this instruction, with the high-order 32 bits
of the branch target address set to 0 in 32-bit mode of 64-bit
implementations.

If AA=1 then the branch target address is the value LI || 0b00 sign-
extended, with the high-order 32 bits of the branch target address set to 0
in 32-bit mode of 64-bit implementations.

If LK=1 then the effective address of the instruction following the 
Branch instruction is placed into the Link Register. 
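
The I-form layout (primary opcode 18 in bits 0:5, LI in bits 6:29, AA in bit 30, LK in bit 31, with bit 0 the most significant) can be illustrated with a small assembler and disassembler pair. The helper names `encode_branch` and `decode_branch` are invented for this sketch and are not assembler or architecture terms.

```python
def encode_branch(offset_or_address, aa=0, lk=0):
    """Assemble an I-form Branch (primary opcode 18) into a 32-bit word."""
    if offset_or_address % 4:
        raise ValueError("branch targets are word aligned")
    if not -(1 << 25) <= offset_or_address < (1 << 25):
        raise ValueError("value does not fit in LI || 0b00")
    li = (offset_or_address >> 2) & 0xFFFFFF      # 24-bit LI field
    return (18 << 26) | (li << 2) | (aa << 1) | lk


def decode_branch(word):
    """Split an I-form word into (EXTS(LI || 0b00), AA, LK)."""
    if word >> 26 != 18:
        raise ValueError("not an I-form Branch")
    value = word & 0x03FFFFFC                     # LI || 0b00
    if value & 0x02000000:                        # sign bit of the 26-bit value
        value -= 1 << 26
    return value, (word >> 1) & 1, word & 1
```

Under these assumptions, `encode_branch(8, lk=1)` produces the word for `bl $+8`, and decoding that word recovers the displacement 8 with AA=0 and LK=1.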



Special Registers Altered

LR    (if LK=1)

Branch Conditional B-form

bc     BO,BI,target_addr    (AA=0 LK=0)
bca    BO,BI,target_addr    (AA=1 LK=0)
bcl    BO,BI,target_addr    (AA=0 LK=1)
bcla   BO,BI,target_addr    (AA=1 LK=1)

Field   16    BO     BI      BD      AA   LK
Bits    0:5   6:10   11:15   16:29   30   31

if (64-bit implementation) & (64-bit mode)
    then M ← 0
    else M ← 32
if ¬BO_2 then CTR ← CTR - 1
ctr_ok ← BO_2 | ((CTR_M:63 ≠ 0) ⊕ BO_3)
cond_ok ← BO_0 | (CR_BI ≡ BO_1)
if ctr_ok & cond_ok then
    if AA then NIA ←iea EXTS(BD || 0b00)
    else       NIA ←iea CIA + EXTS(BD || 0b00)
if LK then LR ←iea CIA + 4

The BI field specifies the bit in the Condition Register to be used as the
condition of the branch. The BO field is used as described above.
target_addr specifies the branch target address.

If AA=0 then the branch target address is the sum of BD || 0b00 sign-
extended and the address of this instruction, with the high-order 32 bits
of the branch target address set to 0 in 32-bit mode of 64-bit
implementations.

If AA=1 then the branch target address is the value BD || 0b00 sign-
extended, with the high-order 32 bits of the branch target address set to 0
in 32-bit mode of 64-bit implementations.

If LK=1 then the effective address of the instruction following the 
Branch instruction is placed into the Link Register. 
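
To make the B-form layout concrete, the sketch below assembles a Branch Conditional word from BO, BI, and a byte displacement. The names `encode_bc`, `bi_for`, and `CR_BIT` are assumptions for this example. The BI value is computed as four times the Condition Register field number plus the bit position within the field (LT=0, GT=1, EQ=2, SO=3), so that "not equal in field 2" corresponds to BO=4, BI=10.

```python
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3}


def bi_for(cr_field, condition):
    """Condition Register bit number for a condition within field cr_field."""
    return 4 * cr_field + CR_BIT[condition]


def encode_bc(bo, bi, displacement, aa=0, lk=0):
    """Assemble a B-form Branch Conditional (primary opcode 16)."""
    if displacement % 4 or not -(1 << 15) <= displacement < (1 << 15):
        raise ValueError("displacement must be word aligned and fit BD || 0b00")
    bd = (displacement >> 2) & 0x3FFF             # 14-bit BD field
    return (16 << 26) | (bo << 21) | (bi << 16) | (bd << 2) | (aa << 1) | lk
```

For instance, `encode_bc(16, 0, 0)` gives the word for a decrement-and-branch-if-nonzero instruction that targets itself.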



Special Registers Altered

CTR    (if BO_2=0)
LR     (if LK=1)

Extended Mnemonics:

Examples of extended mnemonics for Branch Conditional:

Extended:             Equivalent to:
blt   target          bc 12,0,target
bne   cr2,target      bc 4,10,target
bdnz  target          bc 16,0,target



Programming Note (continued)

Here A calls B and B returns to A. The two branches should be as
follows.

- A calls B: use a Branch instruction that sets the Link Register
  (LK=1).

- B returns to A: use the bclr instruction (LK=0) (the return address
  is in, or can be restored to, the Link Register).

■ Indirect subroutine linkage:

Here A calls Glue, Glue calls B, and B returns to A rather than to Glue.
(Such a calling sequence is common in linkage code used when the
subroutine that the programmer wants to call, here B, is in a different
module from the caller: the Binder inserts "glue" code to mediate the
branch.) The three branches should be as follows.

- A calls Glue: use a Branch instruction that sets the Link Register
  (LK=1).

- Glue calls B: place the address of B in the Count Register, and use
  the bcctr instruction (LK=0).

- B returns to A: use the bclr instruction (LK=0) (the return address
  is in, or can be restored to, the Link Register).

Branch Conditional to Link Register XL-form

bclr    BO,BI    (LK=0)
bclrl   BO,BI    (LK=1)

[POWER mnemonics: bcr, bcrl]

Field   19    BO     BI      ///     16      LK
Bits    0:5   6:10   11:15   16:20   21:30   31

if (64-bit implementation) & (64-bit mode)
    then M ← 0
    else M ← 32
if ¬BO_2 then CTR ← CTR - 1
ctr_ok ← BO_2 | ((CTR_M:63 ≠ 0) ⊕ BO_3)
cond_ok ← BO_0 | (CR_BI ≡ BO_1)
if ctr_ok & cond_ok then NIA ←iea LR_0:61 || 0b00
if LK then LR ←iea CIA + 4

The BI field specifies the bit in the Condition Register to be used as the
condition of the branch. The BO field is used as described above, and the
branch target address is LR_0:61 || 0b00, with the high-order 32 bits of the
branch target address set to 0 in 32-bit mode of 64-bit implementations.

If LK=1 then the effective address of the instruction following the
Branch instruction is placed into the Link Register.

Special Registers Altered 

CTR    (if BO_2=0)
LR     (if LK=1)

Extended Mnemonics:

Examples of extended mnemonics for Branch Conditional to Link Register:

Extended:        Equivalent to:
bltlr            bclr 12,0
bnelr   cr2      bclr 4,10
bdnzlr           bclr 16,0



Branch Conditional to Count Register XL-form

bcctr    BO,BI    (LK=0)
bcctrl   BO,BI    (LK=1)

[POWER mnemonics: bcc, bccl]

Field   19    BO     BI      ///     528     LK
Bits    0:5   6:10   11:15   16:20   21:30   31

cond_ok ← BO_0 | (CR_BI ≡ BO_1)
if cond_ok then NIA ←iea CTR_0:61 || 0b00
if LK then LR ←iea CIA + 4

The BI field specifies the bit in the Condition Register to be used as the
condition of the branch. The BO field is used as described above, and the
branch target address is CTR_0:61 || 0b00, with the high-order 32 bits of
the branch target address set to 0 in 32-bit mode of 64-bit
implementations.

If LK=1 then the effective address of the instruction following the
Branch instruction is placed into the Link Register.

If the "decrement and test CTR" option is specified (BO_2=0), the
instruction form is invalid.



Special Registers Altered

LR    (if LK=1)

Extended Mnemonics:

Examples of extended mnemonics for Branch Conditional to Count Register:

Extended:        Equivalent to:
bltctr           bcctr 12,0
bnectr  cr2      bcctr 4,10



2.4.2 System Call Instruction 

This instruction provides the means by which a program can call upon 
the system to perform a service. 

System Call SC-form

sc

[POWER mnemonic: svca]

Field   17    ///    ///     ///     1    /
Bits    0:5   6:10   11:15   16:29   30   31



This instruction calls the system to perform a service. A complete 
description of this instruction can be found in Book III, Section 2.3.1, 
"System Linkage Instructions," on page 378. 

When control is returned to the program that executed the System
Call, the content of the registers will depend on the register conventions
used by the program providing the system service.

This instruction is context synchronizing; see Book III, Section 1.7.1, 
"Context Synchronization," on page 371. 

Special Registers Altered 

Dependent on the system service. 



Compatibility Note

For a discussion of POWER compatibility with respect to instruction
bits 16:29, please refer to Appendix G, "Incompatibilities with the
POWER Architecture," on page 271. For compatibility with future
versions of this architecture, these bits should be coded as zero.
2.4.3 Condition Register Logical Instructions

Extended mnemonics for Condition Register logical operations

A set of extended mnemonics is provided that allows Condition Register
logical operations, beyond those provided by the basic Condition
Register Logical instructions, to be coded easily. Some of these are
shown as examples with the Condition Register Logical instructions. See
Appendix C, "Assembler Extended Mnemonics," on page 215 for additional
extended mnemonics.



Condition Register AND XL-form

crand BT,BA,BB

Field   19    BT     BA      BB      257     /
Bits    0:5   6:10   11:15   16:20   21:30   31

CR_BT ← CR_BA & CR_BB

The bit in the Condition Register specified by BA is ANDed with the 
bit in the Condition Register specified by BB, and the result is placed into 
the bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 



Condition Register OR XL-form

cror BT,BA,BB

Field   19    BT     BA      BB      449     /
Bits    0:5   6:10   11:15   16:20   21:30   31

CR_BT ← CR_BA | CR_BB

The bit in the Condition Register specified by BA is ORed with the bit 
in the Condition Register specified by BB, and the result is placed into the 
bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 
Extended Mnemonics:

Example of extended mnemonics for Condition Register OR:

Extended:         Equivalent to:
crmove Bx,By      cror Bx,By,By



Condition Register XOR XL-form 

crxor BT,BA,BB 
CR_BT ← CR_BA ⊕ CR_BB

The bit in the Condition Register specified by BA is XORed with the
bit in the Condition Register specified by BB, and the result is placed into
the bit in the Condition Register specified by BT.

Special Registers Altered 

CR 

Extended Mnemonics: 

Example of extended mnemonics for Condition Register XOR: 

Extended: Equivalent to: 

crclr Bx crxor Bx,Bx,Bx 



Condition Register NAND XL-form 

crnand BT,BA,BB 
CR_BT ← ¬(CR_BA & CR_BB)

The bit in the Condition Register specified by BA is ANDed with the 
bit in the Condition Register specified by BB, and the complemented 
result is placed into the bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 
Condition Register NOR XL-form

crnor BT,BA,BB

CR_BT ← ¬(CR_BA | CR_BB)

The bit in the Condition Register specified by BA is ORed with the bit 
in the Condition Register specified by BB, and the complemented result is 
placed into the bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 

Extended Mnemonics:

Example of extended mnemonics for Condition Register NOR: 

Extended: Equivalent to: 

crnot Bx,By crnor Bx,By,By 



Condition Register Equivalent XL-form 

creqv BT,BA,BB 
CR_BT ← CR_BA ≡ CR_BB

The bit in the Condition Register specified by BA is XORed with the 
bit in the Condition Register specified by BB, and the complemented 
result is placed into the bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 

Extended Mnemonics:

Example of extended mnemonics for Condition Register Equivalent:

Extended:      Equivalent to:
crset Bx       creqv Bx,Bx,Bx
Condition Register AND with Complement XL-form

crandc BT,BA,BB

CR_BT ← CR_BA & ¬CR_BB

The bit in the Condition Register specified by BA is ANDed with the 
complement of the bit in the Condition Register specified by BB, and the 
result is placed into the bit in the Condition Register specified by BT. 

Special Registers Altered 

CR 



Condition Register OR with Complement XL-form 

crorc BT,BA,BB 
CR_BT ← CR_BA | ¬CR_BB

The bit in the Condition Register specified by BA is ORed with the 
complement of the bit in the Condition Register specified by BB, and the 
result is placed into the bit in the Condition Register specified by BT. 



Special Registers Altered 

CR 
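
The eight Condition Register Logical instructions differ only in the Boolean function applied to the two source bits. The following Python sketch models the Condition Register as a 32-bit integer whose bit 0 is the most significant bit. The table `CR_OPS` and the function `cr_logical` are hypothetical names chosen for this illustration.

```python
# One-bit Boolean functions, keyed by instruction mnemonic.
CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 ^ (a & b),
    "crnor":  lambda a, b: 1 ^ (a | b),
    "creqv":  lambda a, b: 1 ^ (a ^ b),
    "crandc": lambda a, b: a & (1 ^ b),
    "crorc":  lambda a, b: a | (1 ^ b),
}


def cr_logical(cr, op, bt, ba, bb):
    """Return the new 32-bit CR value after `op BT,BA,BB`."""
    def bit(index):
        return (cr >> (31 - index)) & 1           # bit 0 is the leftmost bit

    result = CR_OPS[op](bit(ba), bit(bb))
    shift = 31 - bt
    return (cr & ~(1 << shift)) | (result << shift)
```

The extended mnemonics fall out directly: `cr_logical(cr, "creqv", x, x, x)` sets bit x (crset), `cr_logical(cr, "crxor", x, x, x)` clears it (crclr), and `cr_logical(cr, "cror", x, y, y)` copies bit y to bit x (crmove).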
2.4.4 Condition Register Field Instruction

Move Condition Register Field XL-form

mcrf BF,BFA

CR_4×BF:4×BF+3 ← CR_4×BFA:4×BFA+3

The contents of Condition Register field BFA are copied into Condi- 
tion Register field BF. 

Special Registers Altered 

CR 
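
A minimal sketch of the field copy, again treating the Condition Register as a 32-bit integer with field 0 in the four most significant bits. The function name `mcrf` is reused here purely as a Python identifier for illustration.

```python
def mcrf(cr, bf, bfa):
    """Copy 4-bit CR field BFA into field BF and return the new CR value."""
    field = (cr >> (28 - 4 * bfa)) & 0xF          # bits 4*BFA : 4*BFA+3
    shift = 28 - 4 * bf
    return (cr & ~(0xF << shift)) | (field << shift)
```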
3.1 Fixed-Point Processor Overview

This chapter describes the registers and instructions that make up the 
Fixed-Point Processor facility. Section 3.2, "Fixed-Point Processor Regis- 
ters," describes the registers associated with the Fixed-Point Processor. 
Section 3.3, "Fixed-Point Processor Instructions," on page 49 describes 
the instructions associated with the Fixed-Point Processor. 

3.2 Fixed-Point Processor Registers 
3.2.1 General Purpose Registers 

All manipulation of information is done in registers internal to the Fixed- 
Point Processor. The principal storage internal to the Fixed-Point Proces- 
sor is a set of 32 general purpose registers (GPRs). See Figure 21. 
Each GPR is a 64-bit {32-bit} register. 



GPR 00
GPR 01
  ...
GPR 30
GPR 31
0                                                          63{31}

Figure 21. General purpose registers
3.2.2 Fixed-Point Exception Register

The Fixed-Point Exception Register (XER) is a 32-bit register. 



XER 

0 31 



Figure 22. Fixed-Point Exception Register 

The bit definitions for the Fixed-Point Exception Register are as 
shown below. Here M=0 in 64-bit mode and M=32 in 32-bit mode. 

The bits are set based on the operation of an instruction considered as
a whole, not on intermediate results (e.g., the Subtract From Carrying
instruction, the result of which is specified as the sum of three values, sets
bits in the Fixed-Point Exception Register based on the entire operation,
not on an intermediate sum).

Bit(s)  Description

0       Summary Overflow (SO)

        The Summary Overflow bit is set to 1 whenever an instruction
        (except mtspr) sets the Overflow bit. Once set, the SO bit remains
        set until it is cleared by an mtspr instruction (specifying the
        XER) or an mcrxr instruction. It is not altered by Compare
        instructions, nor by other instructions (except mtspr to the XER,
        and mcrxr) that cannot overflow. Executing an mtspr instruction to
        the XER, supplying the values 0 for SO and 1 for OV, causes SO to
        be set to 0 and OV to be set to 1.

1       Overflow (OV)

        The Overflow bit is set to indicate that an overflow has occurred
        during execution of an instruction. XO-form Add, Subtract From,
        and Negate instructions having OE=1 set it to 1 if the carry out
        of bit M is not equal to the carry out of bit M+1, and set it to 0
        otherwise. XO-form Multiply Low and Divide instructions having
        OE=1 set it to 1 if the result cannot be represented in 64 bits
        (mulld, divd, divdu) or in 32 bits (mullw, divw, divwu), and set it
        to 0 otherwise. The OV bit is not altered by Compare instructions,
        nor by other instructions (except mtspr to the XER, and mcrxr)
        that cannot overflow.

2       Carry (CA)

        The Carry bit is set as follows, during execution of certain
        instructions. Add Carrying, Subtract From Carrying, Add Extended,
        and Subtract From Extended instructions set it to 1 if there is a
        carry out of bit M, and set it to 0 otherwise. Shift Right
        Algebraic instructions set it to 1 if any 1-bits have been shifted
        out of a negative operand, and set it to 0 otherwise. The CA bit is
        not altered by Compare instructions, nor by other instructions
        (except Shift Right Algebraic, mtspr to the XER, and mcrxr) that
        cannot carry.



3:24 Reserved 



25:31 This field specifies the number of bytes to be transferred by a Load 
String Indexed or Store String Indexed instruction. 
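
The Carry and Overflow rules for addition can be checked with a short Python model. This is a sketch under simplifying assumptions: `add_flags` is an invented name, it treats the operands as M:63 slices only (so in 32-bit mode it returns just the low-order 32 bits of the sum), and it reports the values that CA and OV would take for an add that records both.

```python
def add_flags(a, b, carry_in=0, mode32=False):
    """Return (sum, carry out of bit M, overflow) for a + b + carry_in.

    Bit M is the most significant bit considered: bit 0 in 64-bit mode,
    bit 32 in 32-bit mode.
    """
    width = 32 if mode32 else 64
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    total = a + b + carry_in
    carry_out_m = total >> width                  # carry out of bit M
    low = mask >> 1                               # bits M+1 through 63
    carry_out_m1 = ((a & low) + (b & low) + carry_in) >> (width - 1)
    overflow = int(carry_out_m != carry_out_m1)   # OV rule from the text
    return total & mask, carry_out_m, overflow
```

Adding 1 to the largest positive 64-bit value yields no carry out of bit 0 but a carry out of bit 1, so the model reports overflow. Adding 1 to an all-ones value carries out of both positions and reports a carry without overflow.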



Compatibility Note

For a discussion of POWER compatibility with respect to XER bits
16:23, please refer to Appendix G, "Incompatibilities with the POWER
Architecture," on page 271. For compatibility with future versions of
this architecture, these bits should be set to zero.

3.3 Fixed-Point Processor Instructions

This section describes the instructions executed by the Fixed-Point
Processor.



3.3.1 Storage Access Instructions



The Storage Access instructions compute the effective address (EA) of the 
storage to be accessed as described in Section 1.11.2, "Effective Address 
Calculation," on page 29. 

The order of bytes accessed by halfword, word, and doubleword loads
and stores is Big-Endian, unless Little-Endian storage ordering is selected
as described in Appendix D, "Little-Endian Byte Ordering," on page 233.

Storage Access Exceptions 

Storage accesses will cause the system error handler to be invoked if the 
program is not allowed to modify the target storage (Store only) or if the 
program attempts to access storage that is unavailable. 



Programming Note

The "la" extended mnemonic permits computing an effective address as a
Load or Store instruction would, but loads the address itself into a
GPR rather than loading the value that is in storage at that address.
This extended mnemonic is described in "Load Address," on page 232.



3.3.2 Fixed-Point Load Instructions

The byte, halfword, word, or doubleword in storage addressed by EA is
loaded into register RT.

Byte order of PowerPC is Big-Endian by default; see Appendix D,
"Little-Endian Byte Ordering," on page 233 for PowerPC systems operated
with Little-Endian byte ordering.

Many of the Load instructions have an "update" form, in which
register RA is updated with the effective address. For these forms, if
RA≠0 and RA≠RT, the effective address is placed into register RA and the
storage element (byte, halfword, word, or doubleword) addressed by EA is
loaded into RT.

Programming Note

In some implementations, the Load Algebraic and Load with Update
instructions may have greater latency than other types of Load
instructions. Moreover, Load with Update instructions may take longer to
execute in some implementations than the corresponding pair of a
non-update Load instruction and an Add instruction.

Load Byte and Zero D-form

lbz RT,D(RA)

Field   34    RT     RA      D
Bits    0:5   6:10   11:15   16:31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(D)
RT ← ⁵⁶0 || MEM(EA, 1)

Let the effective address (EA) be the sum (RA|0)+D. The byte in
storage addressed by EA is loaded into RT_56:63. RT_0:55 are set to 0.

Special Registers Altered 

None 
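
The D-form effective address calculation and the zero extension can be sketched as follows. The register file is modeled as a Python list `gpr` and storage as a `bytes` object indexed by address; both, along with the helper names, are assumptions made for this example. Note that RA=0 selects the value 0 rather than the contents of GPR 0.

```python
MASK64 = (1 << 64) - 1


def sign_extend16(d):
    """EXTS(D) for a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lbz(gpr, memory, rt, ra, d):
    """Model of lbz RT,D(RA); returns the effective address used."""
    base = 0 if ra == 0 else gpr[ra]              # (RA|0)
    ea = (base + sign_extend16(d)) & MASK64
    gpr[rt] = memory[ea]                          # byte lands in RT_56:63, rest 0
    return ea
```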



Load Byte and Zero Indexed X-form 

lbzx RT,RA,RB

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ⁵⁶0 || MEM(EA, 1)

Let the effective address (EA) be the sum (RA|0)+(RB). The byte in
storage addressed by EA is loaded into RT_56:63. RT_0:55 are set to 0.

Special Registers Altered 

None 
Load Byte and Zero with Update D-form

lbzu RT,D(RA)

EA ← (RA) + EXTS(D)
RT ← ⁵⁶0 || MEM(EA, 1)
RA ← EA

Let the effective address (EA) be the sum (RA)+D. The byte in storage
addressed by EA is loaded into RT_56:63. RT_0:55 are set to 0.
EA is placed into register RA. 
If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 
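
The update form is convenient for stepping through storage, because the base register advances as a side effect of each load. The sketch below extends the same list-and-bytes model (again an assumption for illustration, with invented helper names) and rejects the invalid forms RA=0 and RA=RT.

```python
MASK64 = (1 << 64) - 1


def sign_extend16(d):
    """EXTS(D) for a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lbzu(gpr, memory, rt, ra, d):
    """Model of lbzu RT,D(RA): load a byte, then write EA back to RA."""
    if ra == 0 or ra == rt:
        raise ValueError("invalid instruction form: RA=0 or RA=RT")
    ea = (gpr[ra] + sign_extend16(d)) & MASK64
    gpr[rt] = memory[ea]
    gpr[ra] = ea                                  # RA <- EA
    return ea
```

Starting with the base register one byte before a buffer and using a displacement of 1, repeated calls read successive bytes while leaving the base register pointing at the byte most recently loaded.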



Load Byte and Zero with Update Indexed X-form

lbzux RT,RA,RB

EA ← (RA) + (RB)
RT ← ⁵⁶0 || MEM(EA, 1)
RA ← EA

Let the effective address (EA) be the sum (RA)+(RB). The byte in
storage addressed by EA is loaded into RT_56:63. RT_0:55 are set to 0.
EA is placed into register RA. 
If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 
Load Halfword and Zero D-form

lhz RT,D(RA)

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(D)
RT ← ⁴⁸0 || MEM(EA, 2)

Let the effective address (EA) be the sum (RA|0)+D. The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are set to 0.

Special Registers Altered 

None 



Load Halfword and Zero Indexed X-form 

lhzx RT,RA,RB

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ⁴⁸0 || MEM(EA, 2)

Let the effective address (EA) be the sum (RA|0)+(RB). The halfword
in storage addressed by EA is loaded into RT_48:63. RT_0:47 are set to 0.

Special Registers Altered 

None 
Load Halfword and Zero with Update D-form

lhzu RT,D(RA)

EA ← (RA) + EXTS(D)
RT ← ⁴⁸0 || MEM(EA, 2)
RA ← EA

Let the effective address (EA) be the sum (RA)+D. The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are set to 0.
EA is placed into register RA. 
If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 



Load Halfword and Zero with Update Indexed X-form

lhzux RT,RA,RB

EA ← (RA) + (RB)
RT ← ⁴⁸0 || MEM(EA, 2)
RA ← EA

Let the effective address (EA) be the sum (RA)+(RB). The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are set to 0.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.

Special Registers Altered 

None 
Load Halfword Algebraic D-form

lha RT,D(RA)

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(D)
RT ← EXTS(MEM(EA, 2))

Let the effective address (EA) be the sum (RA|0)+D. The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are filled with a
copy of bit 0 of the loaded halfword.

Special Registers Altered 

None 
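
The difference between the "and Zero" and "Algebraic" halfword loads is only in how the high-order 48 bits of RT are filled. The following sketch assumes Big-Endian byte order (the default described above) and a `bytes` object as storage; `load_halfword` is a hypothetical helper name.

```python
MASK64 = (1 << 64) - 1


def load_halfword(memory, ea, algebraic):
    """Return the 64-bit RT value for lhz (algebraic=False) or lha (True)."""
    raw = memory[ea:ea + 2]
    # signed=True replicates bit 0 of the halfword into RT_0:47;
    # signed=False leaves those bits zero.
    return int.from_bytes(raw, "big", signed=algebraic) & MASK64
```

For the stored bytes 0xFF 0xFE, the zero-extending form yields 0x000000000000FFFE, while the algebraic form yields 0xFFFFFFFFFFFFFFFE.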



Load Halfword Algebraic Indexed X-form 

lhax RT,RA,RB 
if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← EXTS(MEM(EA, 2))

Let the effective address (EA) be the sum (RA|0)+(RB). The halfword
in storage addressed by EA is loaded into RT_48:63. RT_0:47 are filled with
a copy of bit 0 of the loaded halfword.

Special Registers Altered 

None 



Load Halfword Algebraic with Update D-form 

lhau RT,D(RA) 
EA ← (RA) + EXTS(D)
RT ← EXTS(MEM(EA, 2))
RA ← EA

Let the effective address (EA) be the sum (RA)+D. The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are filled with a
copy of bit 0 of the loaded halfword.

EA is placed into register RA. 

If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 



Load Halfword Algebraic with Update Indexed X-form

lhaux RT,RA,RB

EA ← (RA) + (RB)
RT ← EXTS(MEM(EA, 2))
RA ← EA

Let the effective address (EA) be the sum (RA)+(RB). The halfword in
storage addressed by EA is loaded into RT_48:63. RT_0:47 are filled with a
copy of bit 0 of the loaded halfword.

EA is placed into register RA. 

If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 



Load Word and Zero D-form

lwz RT,D(RA)
[POWER mnemonic: l]

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(D)
RT ← ³²0 || MEM(EA, 4)

Let the effective address (EA) be the sum (RA|0)+D. The word in
storage addressed by EA is loaded into RT_32:63. RT_0:31 are set to 0.

Special Registers Altered 

None 



Load Word and Zero Indexed X-form

lwzx RT,RA,RB
[POWER mnemonic: lx]

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ³²0 || MEM(EA, 4)

Let the effective address (EA) be the sum (RA|0)+(RB). The word in
storage addressed by EA is loaded into RT_32:63. RT_0:31 are set to 0.

Special Registers Altered 

None 



Load Word and Zero with Update D-form 

lwzu RT,D(RA)
[POWER mnemonic: lu]

EA ← (RA) + EXTS(D)
RT ← ³²0 || MEM(EA, 4)
RA ← EA

Let the effective address (EA) be the sum (RA)+D. The word in storage
addressed by EA is loaded into RT_32:63. RT_0:31 are set to 0.
EA is placed into register RA. 
If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 
Load Word and Zero with Update Indexed X-form

lwzux RT,RA,RB
[POWER mnemonic: lux]

EA ← (RA) + (RB)
RT ← ³²0 || MEM(EA, 4)
RA ← EA

Let the effective address (EA) be the sum (RA)+(RB). The word in
storage addressed by EA is loaded into RT_32:63. RT_0:31 are set to 0.
EA is placed into register RA. 
If RA=0 or RA=RT, the instruction form is invalid. 

Special Registers Altered 

None 



Load Word Algebraic DS-form 

lwa RT,DS(RA)

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(DS || 0b00)
RT ← EXTS(MEM(EA, 4))

Let the effective address (EA) be the sum (RA|0)+(DS || 0b00). The
word in storage addressed by EA is loaded into RT_32:63. RT_0:31 are filled
with a copy of bit 0 of the loaded word.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 
Load Word Algebraic Indexed X-form

lwax RT,RA,RB

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← EXTS(MEM(EA, 4))

Let the effective address (EA) be the sum (RA|0)+(RB). The word in
storage addressed by EA is loaded into RT_32:63. RT_0:31 are filled with a
copy of bit 0 of the loaded word.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered 

None 



Load Word Algebraic with Update Indexed X-form 

lwaux RT,RA,RB

EA ← (RA) + (RB)
RT ← EXTS(MEM(EA, 4))
RA ← EA

Let the effective address (EA) be the sum (RA)+(RB). The word in
storage addressed by EA is loaded into RT_32:63. RT_0:31 are filled with a
copy of bit 0 of the loaded word.

EA is placed into register RA. 

If RA=0 or RA=RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 
Load Doubleword DS-form

ld RT,DS(RA)

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + EXTS(DS || 0b00)
RT ← MEM(EA, 8)

Let the effective address (EA) be the sum (RA|0)+(DS || 0b00). The
doubleword in storage addressed by EA is loaded into RT.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 
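
DS-form instructions differ from D-form in that the 14-bit DS field has two zero bits appended before sign extension, so the displacement is always a multiple of 4. A minimal sketch, using the same assumed list-and-bytes model and Big-Endian byte order, with invented helper names:

```python
MASK64 = (1 << 64) - 1


def ds_effective_address(gpr, ra, ds):
    """EA = (RA|0) + EXTS(DS || 0b00) for a 14-bit DS field."""
    base = 0 if ra == 0 else gpr[ra]
    disp = (ds & 0x3FFF) << 2                     # DS || 0b00, 16 bits wide
    if disp & 0x8000:                             # sign bit of the 16-bit value
        disp -= 0x10000
    return (base + disp) & MASK64


def ld(gpr, memory, rt, ra, ds):
    """Model of ld RT,DS(RA) with Big-Endian byte order."""
    ea = ds_effective_address(gpr, ra, ds)
    gpr[rt] = int.from_bytes(memory[ea:ea + 8], "big")
    return ea
```

A DS field of 2 therefore addresses the doubleword 8 bytes past the base, and a DS field of all ones addresses 4 bytes before it.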



Load Doubleword Indexed X-form 

ldx RT,RA,RB

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← MEM(EA, 8)

Let the effective address (EA) be the sum (RA|0)+(RB). The
doubleword in storage addressed by EA is loaded into RT.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Load Doubleword with Update DS-form

ldu RT,DS(RA)

EA <- (RA) + EXTS(DS || 0b00)
RT <- MEM(EA, 8)
RA <- EA

Let the effective address (EA) be the sum (RA)+(DS||0b00). The
doubleword in storage addressed by EA is loaded into RT.

EA is placed into register RA. 

If RA=0 or RA=RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Load Doubleword with Update Indexed X-form

ldux RT,RA,RB

EA <- (RA) + (RB)
RT <- MEM(EA, 8)
RA <- EA

Let the effective address (EA) be the sum (RA)+(RB). The doubleword
in storage addressed by EA is loaded into RT.

EA is placed into register RA. 

If RA=0 or RA=RT, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



3.3.3 Fixed-Point Store Instructions

The contents of register RS are stored into the byte, halfword, word, or
doubleword in storage addressed by EA.

Byte order of PowerPC is Big-Endian by default; see Appendix D,
"Little-Endian Byte Ordering," on page 233 for PowerPC systems operated
with Little-Endian byte ordering.

Many of the Store instructions have an "update" form, in which regis- 
ter RA is updated with the effective address. For these forms, the follow- 
ing rules apply. 

■ If RA≠0, the effective address is placed into register RA.

■ If RS=RA, the contents of register RS are copied to the target storage 
element, and then EA is placed into RA (RS). 
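
The ordering in the second rule (store first, then update) can be illustrated with a small Python model of a word store with update. The function name `stwu`, the `bytearray` standing in for storage, and passing the displacement as an already sign-extended integer are assumptions made for this sketch.

```python
MASK64 = (1 << 64) - 1


def stwu(gpr: list, mem: bytearray, rs: int, ra: int, d: int) -> int:
    """Model of a store-with-update: the old (RS) is stored, then RA <- EA."""
    if ra == 0:
        raise ValueError("invalid instruction form: RA=0")
    ea = (gpr[ra] + d) & MASK64
    word = gpr[rs] & 0xFFFFFFFF                  # (RS)32:63
    mem[ea:ea + 4] = word.to_bytes(4, "big")     # the store happens first
    gpr[ra] = ea                                 # then RA receives EA
    return ea


gpr = [0] * 32
mem = bytearray(0x200)
gpr[1] = 0x100
ea = stwu(gpr, mem, rs=1, ra=1, d=-16)   # RS=RA: the old value of r1 is stored
print(hex(ea), mem[ea:ea + 4].hex(), hex(gpr[1]))
```

With RS=RA, storage receives the value the register held before the update, and the register then holds EA.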



Store Byte D-form

stb RS,D(RA)

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(D)
MEM(EA, 1) <- (RS)56:63

Let the effective address (EA) be the sum (RA|0)+D. (RS)56:63 are
stored into the byte in storage addressed by EA.

Special Registers Altered 

None 
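
Bit ranges such as (RS)56:63 use Big-Endian bit numbering, in which bit 0 is the most significant bit of the 64-bit register. A short Python helper shows how such a range maps onto shifts and masks; the name `bits` and its `width` parameter are assumptions made for illustration.

```python
def bits(value: int, first: int, last: int, width: int = 64) -> int:
    """Return value[first:last], numbering bits from 0 at the most significant end."""
    length = last - first + 1
    shift = width - 1 - last          # distance of bit `last` from the low-order end
    return (value >> shift) & ((1 << length) - 1)


rs = 0x1122334455667788
print(hex(bits(rs, 56, 63)))   # byte stored by a byte store
print(hex(bits(rs, 48, 63)))   # halfword stored by a halfword store
print(hex(bits(rs, 32, 63)))   # word stored by a word store
```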



Store Byte Indexed X-form 

stbx RS,RA,RB 

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 1) <- (RS)56:63

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)56:63 are
stored into the byte in storage addressed by EA.

Special Registers Altered 

None 



Store Byte with Update D-form 

stbu RS,D(RA) 

EA <- (RA) + EXTS(D)
MEM(EA, 1) <- (RS)56:63
RA <- EA

Let the effective address (EA) be the sum (RA)+D. (RS)56:63 are
stored into the byte in storage addressed by EA. 
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Byte with Update Indexed X-form 

stbux RS,RA,RB 

EA <- (RA) + (RB)
MEM(EA, 1) <- (RS)56:63
RA <- EA

Let the effective address (EA) be the sum (RA)+(RB). (RS)56:63 are
stored into the byte in storage addressed by EA. 
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Halfword D-form

sth RS,D(RA)

| 44 | RS  | RA  | D        |
  0    6     11    16     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(D)
MEM(EA, 2) <- (RS)48:63

Let the effective address (EA) be the sum (RA|0)+D. (RS)48:63 are
stored into the halfword in storage addressed by EA.

Special Registers Altered 

None 



Store Halfword Indexed X-form 

sthx RS,RA,RB

| 31 | RS  | RA  | RB  | 407 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 2) <- (RS)48:63

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)48:63 are
stored into the halfword in storage addressed by EA.

Special Registers Altered 

None 



Store Halfword with Update D-form 

sthu RS,D(RA)

| 45 | RS  | RA  | D        |
  0    6     11    16     31

EA <- (RA) + EXTS(D)
MEM(EA, 2) <- (RS)48:63
RA <- EA

Let the effective address (EA) be the sum (RA)+D. (RS)48:63 are
stored into the halfword in storage addressed by EA.
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Halfword with Update Indexed X-form 

sthux RS,RA,RB

| 31 | RS  | RA  | RB  | 439 | / |
  0    6     11    16    21    31

EA <- (RA) + (RB)
MEM(EA, 2) <- (RS)48:63
RA <- EA

Let the effective address (EA) be the sum (RA)+(RB). (RS)48:63 are
stored into the halfword in storage addressed by EA.
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Word D-form 

stw RS,D(RA)
[Power mnemonic: st]

| 36 | RS  | RA  | D        |
  0    6     11    16     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(D)
MEM(EA, 4) <- (RS)32:63

Let the effective address (EA) be the sum (RA|0)+D. (RS)32:63 are
stored into the word in storage addressed by EA.

Special Registers Altered 

None 



Store Word Indexed X-form

stwx RS,RA,RB
[Power mnemonic: stx]

| 31 | RS  | RA  | RB  | 151 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 4) <- (RS)32:63

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)32:63 are
stored into the word in storage addressed by EA.

Special Registers Altered 

None 



Store Word with Update D-form

stwu RS,D(RA)
[Power mnemonic: stu]

| 37 | RS  | RA  | D        |
  0    6     11    16     31

EA <- (RA) + EXTS(D)
MEM(EA, 4) <- (RS)32:63
RA <- EA

Let the effective address (EA) be the sum (RA)+D. (RS)32:63 are
stored into the word in storage addressed by EA.
EA is placed into register RA.
If RA=0, the instruction form is invalid.

Special Registers Altered 

None 



Store Word with Update Indexed X-form

stwux RS,RA,RB
[Power mnemonic: stux]

| 31 | RS  | RA  | RB  | 183 | / |
  0    6     11    16    21    31

EA <- (RA) + (RB)
MEM(EA, 4) <- (RS)32:63
RA <- EA

Let the effective address (EA) be the sum (RA)+(RB). (RS)32:63 are
stored into the word in storage addressed by EA.
EA is placed into register RA.
If RA=0, the instruction form is invalid.

Special Registers Altered 

None 



Store Doubleword DS-form 

std RS,DS(RA)

| 62 | RS  | RA  | DS       | 0  |
  0    6     11    16         30 31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(DS || 0b00)
MEM(EA, 8) <- (RS)

Let the effective address (EA) be the sum (RA|0)+(DS||0b00). (RS) is
stored into the doubleword in storage addressed by EA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Store Doubleword Indexed X-form

stdx RS,RA,RB

| 31 | RS  | RA  | RB  | 149 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 8) <- (RS)

Let the effective address (EA) be the sum (RA|0)+(RB). (RS) is stored
into the doubleword in storage addressed by EA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Store Doubleword with Update DS-form

stdu RS,DS(RA)

| 62 | RS  | RA  | DS       | 1  |
  0    6     11    16         30 31

EA <- (RA) + EXTS(DS || 0b00)
MEM(EA, 8) <- (RS)
RA <- EA

Let the effective address (EA) be the sum (RA)+(DS||0b00). (RS) is
stored into the doubleword in storage addressed by EA.
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Store Doubleword with Update Indexed X-form

stdux RS,RA,RB

| 31 | RS  | RA  | RB  | 181 | / |
  0    6     11    16    21    31

EA <- (RA) + (RB)
MEM(EA, 8) <- (RS)
RA <- EA

Let the effective address (EA) be the sum (RA)+(RB). (RS) is stored
into the doubleword in storage addressed by EA.
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered 

None 



Programming Note

In some implementations, the Load Byte-Reverse instructions may have
greater latency than other Load instructions.



3.3.4 Fixed-Point Load and Store with Byte Reversal Instructions

When used in a PowerPC system operating with Big-Endian byte order 
(the default), these instructions have the effect of loading and storing data 
in Little-Endian order. Likewise, when used in a PowerPC system operat- 
ing with Little-Endian byte order, these instructions have the effect of 
loading and storing data in Big-Endian order. See Appendix D, "Little- 
Endian Byte Ordering," on page 233 for a discussion of byte order. 

Load Halfword Byte-Reverse Indexed X-form 

lhbrx RT,RA,RB

| 31 | RT  | RA  | RB  | 790 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
RT <- ⁴⁸0 || MEM(EA+1, 1) || MEM(EA, 1)

Let the effective address (EA) be the sum (RA|0)+(RB). Bits 0:7 of the
halfword in storage addressed by EA are loaded into RT56:63. Bits 8:15
of the halfword in storage addressed by EA are loaded into RT48:55.
RT0:47 are set to 0.

Special Registers Altered 

None 
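
The effect of a byte-reversed halfword load on a Big-Endian system can be compared with an ordinary halfword load in a few lines of Python. This is only a sketch: the helper names `lhz_model` and `lhbrx_model` and the `bytes` object used as storage are assumptions made for the example.

```python
def lhz_model(mem: bytes, ea: int) -> int:
    """Ordinary Big-Endian halfword load: MEM(EA, 2), zero-extended."""
    return int.from_bytes(mem[ea:ea + 2], "big")


def lhbrx_model(mem: bytes, ea: int) -> int:
    """Byte-reversed load: 48 zero bits || MEM(EA+1, 1) || MEM(EA, 1)."""
    return (mem[ea + 1] << 8) | mem[ea]


storage = bytes([0x12, 0x34])
normal = lhz_model(storage, 0)
reversed_load = lhbrx_model(storage, 0)
print(hex(normal), hex(reversed_load))
```

The byte-reversed result equals the value obtained by interpreting the same two bytes in Little-Endian order.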



Load Word Byte-Reverse Indexed X-form 

lwbrx RT,RA,RB
[Power mnemonic: lbrx]

| 31 | RT  | RA  | RB  | 534 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
RT <- ³²0 || MEM(EA+3, 1) || MEM(EA+2, 1)
          || MEM(EA+1, 1) || MEM(EA, 1)

Let the effective address (EA) be the sum (RA|0)+(RB). Bits 0:7 of the
word in storage addressed by EA are loaded into RT56:63. Bits 8:15 of
the word in storage addressed by EA are loaded into RT48:55. Bits 16:23
of the word in storage addressed by EA are loaded into RT40:47. Bits
24:31 of the word in storage addressed by EA are loaded into RT32:39.
RT0:31 are set to 0.

Special Registers Altered 

None 



Store Halfword Byte-Reverse Indexed X-form 

sthbrx RS,RA,RB

| 31 | RS  | RA  | RB  | 918 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 2) <- (RS)56:63 || (RS)48:55

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)56:63 are
stored into bits 0:7 of the halfword in storage addressed by EA.
(RS)48:55 are stored into bits 8:15 of the halfword in storage addressed
by EA.

Special Registers Altered 

None 



Store Word Byte-Reverse Indexed X-form 

stwbrx RS,RA,RB
[Power mnemonic: stbrx]

| 31 | RS  | RA  | RB  | 662 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
MEM(EA, 4) <- (RS)56:63 || (RS)48:55 || (RS)40:47 || (RS)32:39

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)56:63 are
stored into bits 0:7 of the word in storage addressed by EA. (RS)48:55 are
stored into bits 8:15 of the word in storage addressed by EA. (RS)40:47
are stored into bits 16:23 of the word in storage addressed by EA.
(RS)32:39 are stored into bits 24:31 of the word in storage addressed by
EA.

Special Registers Altered 

None 



3.3.5 Fixed-Point Load and Store Multiple Instructions

The Load/Store Multiple instructions have preferred forms: see Section 
1.9.1, "Preferred Instruction Forms," on page 25. In the preferred forms, 
storage alignment satisfies the following rule. 

■ The combination of the EA and RT (RS) is such that the low-order 
byte of GPR 31 is loaded (stored) from (into) the last byte of an 
aligned quadword in storage. 



Compatibility Note

For a discussion of POWER compatibility with respect to the alignment
of the EA for the Load Multiple Word and Store Multiple Word
instructions, please refer to Appendix G, "Incompatibilities with the
POWER Architecture," on page 271. For compatibility with future
versions of this architecture, these EAs should be word-aligned.

On PowerPC systems operating with Little-Endian byte order, execution
of a Load Multiple or Store Multiple instruction causes the system
alignment error handler to be invoked. See Appendix D, "Little-Endian
Byte Ordering," on page 233.



Load Multiple Word D-form 

lmw RT,D(RA)
[Power mnemonic: lm]

| 46 | RT  | RA  | D        |
  0    6     11    16     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(D)
r <- RT
do while r ≤ 31
   GPR(r) <- ³²0 || MEM(EA, 4)
   r <- r + 1
   EA <- EA + 4

Let n = (32-RT). Let the effective address (EA) be the sum (RA|0)+D.

n consecutive words starting at EA are loaded into the low-order 32 
bits of GPRs RT through 31. The high-order 32 bits of these GPRs are 
set to zero. 

EA must be a multiple of 4. If it is not, either the system alignment 
error handler is invoked or the results are boundedly undefined. 

If RA is in the range of registers to be loaded or RT=RA=0, the 
instruction form is invalid. 

Special Registers Altered 

None 
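
The Load Multiple Word loop can be modeled in Python as shown here. The sketch assumes a list `gpr` for the register file, a `bytes` object for Big-Endian storage, and a displacement already sign-extended to a Python integer; it raises an exception where the architecture leaves the outcome to the alignment error handler or defines the form as invalid.

```python
MASK64 = (1 << 64) - 1


def lmw(gpr: list, mem: bytes, rt: int, ra: int, d: int) -> None:
    """Load n = 32 - RT consecutive words into GPRs RT through 31."""
    if rt <= ra <= 31:                      # covers RA in range and RT=RA=0
        raise ValueError("invalid instruction form")
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + d) & MASK64
    if ea % 4 != 0:
        raise ValueError("EA must be a multiple of 4")
    for r in range(rt, 32):                 # r runs from RT through 31 inclusive
        gpr[r] = int.from_bytes(mem[ea:ea + 4], "big")   # high-order 32 bits are 0
        ea += 4


# Eight words 0xA0000000 .. 0xA0000007 laid out in Big-Endian order.
storage = b"".join((0xA0000000 + i).to_bytes(4, "big") for i in range(8))
gpr = [0] * 32
gpr[1] = 8
lmw(gpr, storage, rt=29, ra=1, d=4)         # EA = 12, three registers loaded
print([hex(gpr[r]) for r in (29, 30, 31)])
```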



Store Multiple Word D-form

stmw RS,D(RA)
[Power mnemonic: stm]

| 47 | RS  | RA  | D        |
  0    6     11    16     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + EXTS(D)
r <- RS
do while r ≤ 31
   MEM(EA, 4) <- GPR(r)32:63
   r <- r + 1
   EA <- EA + 4

Let n = (32-RS). Let the effective address (EA) be the sum (RA|0)+D.

n consecutive words starting at EA are stored from the low-order 32 
bits of GPRs RS through 31. 

EA must be a multiple of 4. If it is not, either the system alignment 
error handler is invoked or the results are boundedly undefined. 

Special Registers Altered 

None 

3.3.6 Fixed-Point Move Assist Instructions

The Move Assist instructions allow movement of data from storage to 
registers or from registers to storage without concern for alignment. 
These instructions can be used for a short move between arbitrary stor- 
age locations or to initiate a long move between unaligned storage fields. 

The Load/Store String instructions have preferred forms: see Section
1.9.1, "Preferred Instruction Forms," on page 25. In the preferred forms,
register usage satisfies the following rules.

■ RS = 5 

■ RT = 5 

■ last register loaded/stored ≤ 12

On PowerPC systems operating with Little-Endian byte order, execution
of a Load/Store String instruction causes the system alignment error
handler to be invoked. See Appendix D, "Little-Endian Byte Ordering,"
on page 233.



Load String Word Immediate X-form 

lswi RT,RA,NB
[Power mnemonic: lsi]

| 31 | RT  | RA  | NB  | 597 | / |
  0    6     11    16    21    31

if RA = 0 then EA <- 0
else           EA <- (RA)
if NB = 0 then n <- 32
else           n <- NB
r <- RT - 1
i <- 32
do while n > 0
   if i = 32 then
      r <- r + 1 (mod 32)
      GPR(r) <- 0
   GPR(r)i:i+7 <- MEM(EA, 1)
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be (RA|0). Let n = NB if NB≠0, n = 32 if
NB=0: n is the number of bytes to load. Let nr = CEIL(n÷4): nr is the
number of registers to receive data.

n consecutive bytes starting at EA are loaded into GPRs RT through
RT+nr-1. Data are loaded into the low-order four bytes of each GPR; the
high-order four bytes are set to 0.

Bytes are loaded left to right in each register. The sequence of registers 
wraps around to GPR 0 if required. If the low-order four bytes of register 
RT+nr-1 are only partially filled, the unfilled low-order byte(s) of that 
register are set to 0. 

If RA is in the range of registers to be loaded or RT=RA=0, the 
instruction form is invalid. 

Special Registers Altered 

None 
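
The byte-by-byte register filling described for Load String Word Immediate can be traced with a direct Python transcription of the loop. The names `lswi`, `gpr`, and `storage` are assumptions made for this sketch, and the check for invalid instruction forms is omitted for brevity.

```python
def lswi(gpr: list, mem: bytes, rt: int, ra: int, nb: int) -> None:
    """Load n bytes into the low-order words of successive GPRs, left to right."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    r = rt - 1
    i = 32                                   # bit index of the next byte in GPR(r)
    while n > 0:
        if i == 32:
            r = (r + 1) % 32                 # wraps around to GPR 0 if required
            gpr[r] = 0
        gpr[r] |= mem[ea] << (63 - (i + 7))  # place the byte into bits i:i+7
        i += 8
        if i == 64:
            i = 32
        ea += 1
        n -= 1


storage = bytes(range(1, 8))                 # bytes 0x01 .. 0x07
gpr = [0] * 32
lswi(gpr, storage, rt=30, ra=1, nb=6)        # (r1) = 0, so EA starts at 0
print(hex(gpr[30]), hex(gpr[31]))
```

Six bytes occupy nr = 2 registers; the second register is only partially filled, so its two low-order bytes remain 0.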



Load String Word Indexed X-form

lswx RT,RA,RB
[Power mnemonic: lsx]

| 31 | RT  | RA  | RB  | 533 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
n <- XER25:31
r <- RT - 1
i <- 32
RT <- undefined
do while n > 0
   if i = 32 then
      r <- r + 1 (mod 32)
      GPR(r) <- 0
   GPR(r)i:i+7 <- MEM(EA, 1)
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be the sum (RA|0)+(RB). Let n =
XER25:31: n is the number of bytes to load. Let nr = CEIL(n÷4): nr is the
number of registers to receive data.

If n>0, n consecutive bytes starting at EA are loaded into GPRs RT
through RT+nr-1. Data are loaded into the low-order four bytes of each
GPR; the high-order four bytes are set to 0.

Bytes are loaded left to right in each register. The sequence of registers
wraps around to GPR 0 if required. If the low-order four bytes of register
RT+nr-1 are only partially filled, the unfilled low-order byte(s) of that
register are set to 0.

If n=0, the content of register RT is undefined. 

If RA or RB is in the range of registers to be loaded, either the system 
illegal instruction error handler is invoked or the results are boundedly 
undefined. If RT=RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store String Word Immediate X-form

stswi RS,RA,NB
[Power mnemonic: stsi]

| 31 | RS  | RA  | NB  | 725 | / |
  0    6     11    16    21    31

if RA = 0 then EA <- 0
else           EA <- (RA)
if NB = 0 then n <- 32
else           n <- NB
r <- RS - 1
i <- 32
do while n > 0
   if i = 32 then r <- r + 1 (mod 32)
   MEM(EA, 1) <- GPR(r)i:i+7
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be (RA|0). Let n = NB if NB≠0, n = 32 if
NB=0: n is the number of bytes to store. Let nr = CEIL(n÷4): nr is the
number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs RS through 
RS+nr-1. Data are stored from the low-order four bytes of each GPR. 

Bytes are stored left to right from each register. The sequence of regis- 
ters wraps around to GPR 0 if required. 

Special Registers Altered 

None 



Store String Word Indexed X-form

stswx RS,RA,RB
[Power mnemonic: stsx]

| 31 | RS  | RA  | RB  | 661 | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
n <- XER25:31
r <- RS - 1
i <- 32
do while n > 0
   if i = 32 then r <- r + 1 (mod 32)
   MEM(EA, 1) <- GPR(r)i:i+7
   i <- i + 8
   if i = 64 then i <- 32
   EA <- EA + 1
   n <- n - 1

Let the effective address (EA) be the sum (RA|0)+(RB). Let n =
XER25:31: n is the number of bytes to store. Let nr = CEIL(n÷4): nr is the
number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs RS through
RS+nr-1. Data are stored from the low-order four bytes of each GPR.

Bytes are stored left to right from each register. The sequence of regis- 
ters wraps around to GPR 0 if required. 

If n = 0, no bytes are stored. 

Special Registers Altered 

None 

3.3.7 Storage Synchronization Instructions 

The Storage Synchronization instructions can be used to control the order 
in which storage operations are completed with respect to asynchronous 
events, and the order in which storage operations are seen by other pro- 
cessors and by other mechanisms that access storage. Additional informa- 
tion about these instructions and about related aspects of storage 
management can be found in Book II, Sections 1.8.1, "Storage Access 
Ordering," on page 333 and 1.8.2, "Atomic Update Primitives," on 
page 336, and Book III, Chapter 4, "Storage Control," on page 391. 



Programming Note

Because the Storage Synchronization instructions have implementation
dependencies (e.g., the granularity at which reservations are managed),
they must be used with care. The operating system should provide system
library programs that use these instructions to implement the high-level
synchronization functions (Test and Set, Compare and Swap, etc.) needed
by application programs. Application programs should use these library
programs, rather than use the Storage Synchronization instructions
directly.

On a PowerPC system operating with Little-Endian byte order, the
three low-order bits of the effective address computed by Load And
Reserve and Store Conditional are modified before accessing storage. See
Appendix D, "Little-Endian Byte Ordering," on page 233.



Load Word And Reserve Indexed X-form

lwarx RT,RA,RB

| 31 | RT  | RA  | RB  | 20  | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
RESERVE <- 1
RESERVE_ADDR <- func(EA)
RT <- ³²0 || MEM(EA, 4)

Let the effective address (EA) be the sum (RA|0)+(RB). The word in
storage addressed by EA is loaded into RT32:63. RT0:31 are set to 0.

This instruction creates a reservation for use by a Store Word
Conditional instruction. An address computed from the EA is associated
with the reservation and replaces any address previously associated with
the reservation: the manner in which the address to be associated with
the reservation is computed from the EA is described in Book II, Section
1.8.2, "Atomic Update Primitives," on page 336.

EA must be a multiple of 4. If it is not, either the system alignment
error handler is invoked or the results are boundedly undefined.

Special Registers Altered

None



Load Doubleword And Reserve Indexed X-form

ldarx RT,RA,RB

| 31 | RT  | RA  | RB  | 84  | / |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
RESERVE <- 1
RESERVE_ADDR <- func(EA)
RT <- MEM(EA, 8)

Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword
in storage addressed by EA is loaded into RT.

This instruction creates a reservation for use by a Store Doubleword
Conditional instruction. An address computed from the EA is associated
with the reservation and replaces any address previously associated with
the reservation: the manner in which the address to be associated with the
reservation is computed from the EA is described in Book II, Section
1.8.2, "Atomic Update Primitives," on page 336.

EA must be a multiple of 8. If it is not, either the system alignment
error handler is invoked or the results are boundedly undefined.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

None



Programming Note

The granularity with which reservations are managed is
implementation-dependent. Therefore the storage to be accessed by the
Load And Reserve and Store Conditional instructions should be allocated
by a system library program. Additional information can be found in
Book II, Section 1.8.2, "Atomic Update Primitives," on page 336.



Programming Note

When correctly used, the Load And Reserve and Store Conditional
instructions can provide an atomic update function for a single aligned
word (Load Word And Reserve and Store Word Conditional) or doubleword
(Load Doubleword And Reserve and Store Doubleword Conditional) of
storage.

One of the requirements for correct use is that Load Word And Reserve
be paired with Store Word Conditional, and Load Doubleword And Reserve
with Store Doubleword Conditional, with the same effective address used
for both instructions of the pair. Examples of correct uses of these
instructions to emulate primitives such as "Fetch and Add," "Test and
Set," and "Compare and Swap" can be found in Appendix E.1,
"Synchronization," on page 249.

At most one reservation exists on any given processor: there are not
separate reservations for words and for doublewords.

The conditionality of the Store Conditional instruction's store is
based only on whether a reservation exists, not on a match between the
address associated with the reservation and the address computed from
the EA of the Store Conditional instruction.

A reservation is cleared if any of the following events occurs.

■ The processor holding the reservation executes another Load And
Reserve instruction; this clears the first reservation and establishes
a new one.

■ The processor holding the reservation executes a Store Conditional
instruction to any address.

■ Another processor executes any Store instruction to the address
associated with the reservation.

■ Any mechanism, other than the processor holding the reservation,
stores to the address associated with the reservation.

See Book II, Section 1.8.2, "Atomic Update Primitives," on page 336
for additional information.



Store Word Conditional Indexed X-form

stwcx. RS,RA,RB

| 31 | RS  | RA  | RB  | 150 | 1 |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
if RESERVE then
   MEM(EA, 4) <- (RS)32:63
   RESERVE <- 0
   CR0 <- 0b00 || 0b1 || XER[SO]
else
   CR0 <- 0b00 || 0b0 || XER[SO]

Let the effective address (EA) be the sum (RA|0)+(RB).

If a reservation exists, (RS)32:63 are stored into the word in storage
addressed by EA and the reservation is cleared.

If a reservation does not exist, the instruction completes without
altering storage.

CR Field 0 is set to reflect whether the store operation was performed
(i.e., whether a reservation existed when the stwcx. instruction
commenced execution), as follows.

CR0[LT GT EQ SO] = 0b00 || store_performed || XER[SO]

EA must be a multiple of 4. If it is not, either the system alignment
error handler is invoked or the results are boundedly undefined.

Special Registers Altered

CR0



Store Doubleword Conditional Indexed X-form

stdcx. RS,RA,RB

| 31 | RS  | RA  | RB  | 214 | 1 |
  0    6     11    16    21    31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
if RESERVE then
   MEM(EA, 8) <- (RS)
   RESERVE <- 0
   CR0 <- 0b00 || 0b1 || XER[SO]
else
   CR0 <- 0b00 || 0b0 || XER[SO]

Let the effective address (EA) be the sum (RA|0)+(RB).

If a reservation exists, (RS) is stored into the doubleword in storage
addressed by EA and the reservation is cleared.

If a reservation does not exist, the instruction completes without
altering storage.

CR Field 0 is set to reflect whether the store operation was performed
(i.e., whether a reservation existed when the stdcx. instruction
commenced execution), as follows.

CR0[LT GT EQ SO] = 0b00 || store_performed || XER[SO]

EA must be a multiple of 8. If it is not, either the system alignment
error handler is invoked or the results are boundedly undefined.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0
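
The pairing of Load And Reserve with Store Conditional can be illustrated by a small Python model of a "Fetch and Add" retry loop. This is a simplified sketch, not the sequence given in Appendix E.1: the class `Processor`, its methods, and the `interfere` callback that simulates a store by another processor are all hypothetical, and the reservation address is taken to be the EA itself although the real granularity is implementation-dependent.

```python
class Processor:
    """Toy model of one processor's single reservation."""

    def __init__(self, mem: bytearray):
        self.mem = mem
        self.reserve = False
        self.reserve_addr = None

    def lwarx(self, ea: int) -> int:
        if ea % 4 != 0:
            raise ValueError("EA must be a multiple of 4")
        self.reserve = True
        self.reserve_addr = ea            # assumption: func(EA) is EA itself
        return int.from_bytes(self.mem[ea:ea + 4], "big")

    def stwcx(self, value: int, ea: int) -> bool:
        """Return the store_performed bit (the EQ bit of CR Field 0)."""
        if not self.reserve:
            return False                  # storage is not altered
        self.mem[ea:ea + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")
        self.reserve = False
        return True


def fetch_and_add(cpu: Processor, ea: int, delta: int, interfere=None) -> int:
    """Retry until the conditional store succeeds; return the old value."""
    while True:
        old = cpu.lwarx(ea)
        if interfere is not None:         # simulate one conflicting store
            interfere()
            interfere = None
        if cpu.stwcx(old + delta, ea):
            return old


mem = bytearray(8)
mem[0:4] = (5).to_bytes(4, "big")
cpu = Processor(mem)


def other_processor_store():
    mem[0:4] = (100).to_bytes(4, "big")   # a store to the reserved address
    cpu.reserve = False                   # clears the reservation


old_value = fetch_and_add(cpu, 0, 1, interfere=other_processor_store)
new_value = int.from_bytes(mem[0:4], "big")
print(old_value, new_value)
```

The first attempt loses its reservation to the simulated store, so the loop reloads the word and the update is applied to the value written by the other processor.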



Programming Note

The sync instruction can be used to ensure that the results of all
stores into a data structure that are performed in a "critical section"
of a program are seen by other processors before the data structure is
seen as unlocked.

The functions performed by the sync instruction will normally take a
significant amount of time to complete, so indiscriminate use of this
instruction may adversely affect performance. In addition, the time
required to execute sync may vary from one execution to another.

The Enforce In-order Execution of I/O (eieio) instruction, described in
Book II, Sections 1.8.1, "Storage Access Ordering," on page 333 and 3.3,
"Enforce In-order Execution of I/O Instruction," on page 350, may be
more appropriate than sync for cases in which the only requirement is to
control the order in which storage references are seen by I/O devices.



Synchronize X-form

sync
[Power mnemonic: dcs]

| 31 | /// | /// | /// |

The sync instruction provides an ordering function for the effects of all
instructions executed by a given processor. Executing a sync instruction
ensures that all instructions previously initiated by the given processor
appear to have completed before the sync instruction completes, and that
no subsequent instructions are initiated by the given processor until after
the sync instruction completes. When the sync instruction completes, all
storage accesses initiated by the given processor prior to the sync
instruction will have been performed with respect to all other mechanisms
that access storage. (See Book II, "Synchronize," on page 334 for a more
complete description. See also Book III, Section 4.12, "Table Update
Synchronization Requirements," on page 446 for an exception involving
TLB invalidates.)

This instruction is execution synchronizing (see Book III, Section 
1.7.2, "Execution Synchronization," on page 372). 

Special Registers Altered 

None 



3.3.8 Other Fixed-Point Instructions

The remainder of the fixed-point instructions use the contents of the Gen- 
eral Purpose Registers (GPRs) as source operands, and place results into 
GPRs, into the Fixed-Point Exception Register (XER), and into Condi- 
tion Register fields. In addition, the Trap instructions compare the con- 
tents of one GPR with a second GPR or immediate data and, if the 
conditions are met, invoke the system trap handler. 

These instructions treat the source operands as signed integers unless 
the instruction is explicitly identified as performing an unsigned opera- 
tion. 

The X-form and XO-form instructions with Rc=1, and the D-form
instructions addic., andi., and andis., set the first three bits of CR Field 0
to characterize the result placed into the target register. In 64-bit mode,
these bits are set by signed comparison of the result to zero. In 32-bit
mode, these bits are set by signed comparison of the low-order 32 bits of
the result to zero.

Unless otherwise noted and when appropriate, when CR Field 0 and 
the XER are set they reflect the value placed into the target register. 
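
The mode dependence of the first three bits of CR Field 0 can be made concrete with a short Python sketch. The function name `cr0_lt_gt_eq` and its boolean `mode64` parameter are assumptions made for illustration; only the LT, GT, and EQ bits described above are modeled.

```python
def cr0_lt_gt_eq(result: int, mode64: bool) -> tuple:
    """Return (LT, GT, EQ) from a signed comparison of the result to zero."""
    width = 64 if mode64 else 32          # 32-bit mode looks at the low-order word
    value = result & ((1 << width) - 1)
    if value & (1 << (width - 1)):        # reinterpret as a signed integer
        value -= 1 << width
    return (int(value < 0), int(value > 0), int(value == 0))


r1 = 0x0000000080000000
r2 = 0x0000000100000000
print(cr0_lt_gt_eq(r1, mode64=True), cr0_lt_gt_eq(r1, mode64=False))
print(cr0_lt_gt_eq(r2, mode64=True), cr0_lt_gt_eq(r2, mode64=False))
```

The same 64-bit register value can therefore be characterized as positive in 64-bit mode and as negative or zero in 32-bit mode.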

3.3.9 Fixed-Point Arithmetic Instructions

The XO-form Arithmetic instructions with Rc=1, and the D-form
Arithmetic instruction addic., set the first three bits of CR Field 0 as
described in Section 3.3.8, "Other Fixed-Point Instructions," on page 80.

addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme, addze,
and subfze always set CA, to reflect the carry out of bit 0 in 64-bit mode
and out of bit 32 in 32-bit mode. The XO-form Arithmetic instructions
set SO and OV when OE=1 to reflect overflow of the result. Except for
the Multiply Low and Divide instructions, the setting of these bits is
mode-dependent, and reflects overflow of the 64-bit result in 64-bit mode
and overflow of the low-order 32-bit result in 32-bit mode. For XO-form
Multiply Low and Divide instructions, the setting of these bits is
mode-independent, and reflects overflow of the 64-bit result for mulld, divd,
and divdu, and overflow of the low-order 32-bit result for mullw, divw,
and divwu.
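
As a rough illustration of the mode-dependent CA and OV settings, the following Python sketch models an addco-style addition. The function name `addco` and its calling convention are assumptions made for this example; register values are taken as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def to_signed(value, bits):
    """Interpret the low-order `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def addco(a, b, mode64=True):
    """Return (result, CA, OV) for adding two 64-bit register values."""
    width = 64 if mode64 else 32
    mask = (1 << width) - 1
    result = (a + b) & MASK64                 # the whole register is written in either mode
    ca = ((a & mask) + (b & mask)) >> width   # carry out of bit 0 (64-bit) or bit 32 (32-bit)
    exact = to_signed(a, width) + to_signed(b, width)
    in_range = -(1 << (width - 1)) <= exact < (1 << (width - 1))
    return result, ca, int(not in_range)
```

With operands 0xFFFF_FFFF and 1 the register result is the same in both modes, but CA is 1 only in 32-bit mode, which is why a multi-instruction sequence should not change mode midway.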

Extended mnemonics for addition and subtraction 

Several extended mnemonics are provided that use the Add Immediate 
and Add Immediate Shifted instructions to load an immediate value or an 
address into a target register. Some of these are shown as examples with 
the two instructions. 

The PowerPC Architecture supplies Subtract From instructions, which 
subtract the second operand from the third. A set of extended mnemonics 
is provided that use the more "normal" order, in which the third operand 
is subtracted from the second, with the third operand being either an 
immediate field or a register. Some of these are shown as examples with 
the appropriate Add and Subtract From instructions. 

See Appendix C, "Assembler Extended Mnemonics," on page 215 for 
additional extended mnemonics. 



Programming Note

Instructions with the OE bit set or which set CA may execute slowly or
may prevent the execution of subsequent instructions until the operation
is completed.

Programming Note

Notice that CR Field 0 may not reflect the "true" (infinitely precise)
result if overflow occurs.

Programming Note

addi, addis, add, and subf are the preferred instructions for addition
and subtraction, because they set few status bits.

Notice that addi and addis use the value 0, not the contents of GPR 0, if
RA=0.



Add Immediate D-form 

addi RT,RA,SI 
[Power mnemonic: cal] 





| 14 | RT | RA | SI |

if RA = 0 then RT <- EXTS(SI)
else RT <- (RA) + EXTS(SI)

The sum (RA|0) + SI is placed into register RT.



Special Registers Altered 

None 



Extended Mnemonics:

Examples of extended mnemonics for Add Immediate:

Extended:               Equivalent to:
li   Rx,value           addi Rx,0,value
la   Rx,disp(Ry)        addi Rx,Ry,disp
subi Rx,Ry,value        addi Rx,Ry,-value



Add Immediate Shifted D-form

addis RT,RA,SI
[Power mnemonic: cau]

if RA = 0 then RT <- EXTS(SI || ¹⁶0)
else RT <- (RA) + EXTS(SI || ¹⁶0)

The sum (RA|0) + (SI || 0x0000) is placed into register RT.

Special Registers Altered

None

Extended Mnemonics:

Examples of extended mnemonics for Add Immediate Shifted:

Extended:               Equivalent to:
lis   Rx,value          addis Rx,0,value
subis Rx,Ry,value       addis Rx,Ry,-value
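
One common use of these two instructions is building a 32-bit constant with lis followed by addi. The Python sketch below models the (RA|0) rule and the sign extension of SI; the helpers `split_constant` and `load_constant` are hypothetical names introduced only for illustration, and the target register is assumed not to be GPR 0.

```python
MASK64 = (1 << 64) - 1


def exts16(si):
    """Sign-extend a 16-bit immediate field."""
    si &= 0xFFFF
    return si - 0x10000 if si & 0x8000 else si


def addi(regs, rt, ra, si):
    base = 0 if ra == 0 else regs[ra]   # RA=0 selects the value 0, not GPR 0
    regs[rt] = (base + exts16(si)) & MASK64


def addis(regs, rt, ra, si):
    base = 0 if ra == 0 else regs[ra]
    regs[rt] = (base + (exts16(si) << 16)) & MASK64


def split_constant(value32):
    """Choose the SI fields for `lis` followed by `addi`."""
    low = value32 & 0xFFFF
    # addi sign-extends `low`, so the high half must absorb the borrow.
    high = ((value32 - exts16(low)) >> 16) & 0xFFFF
    return high, low


def load_constant(regs, rt, value32):
    high, low = split_constant(value32)
    addis(regs, rt, 0, high)   # lis  rt,high
    addi(regs, rt, rt, low)    # addi rt,rt,low  (rt must not be 0)
    return regs[rt]
```

When the low halfword has its sign bit set, the high halfword passed to lis is one greater than the literal upper 16 bits, because the following addi subtracts.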



Add XO-form

add     RT,RA,RB    (OE=0 Rc=0)
add.    RT,RA,RB    (OE=0 Rc=1)
addo    RT,RA,RB    (OE=1 Rc=0)
addo.   RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: cax, cax., caxo, caxo.]

RT <- (RA) + (RB)

The sum (RA) + (RB) is placed into register RT.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)

Subtract From XO-form

subf    RT,RA,RB    (OE=0 Rc=0)
subf.   RT,RA,RB    (OE=0 Rc=1)
subfo   RT,RA,RB    (OE=1 Rc=0)
subfo.  RT,RA,RB    (OE=1 Rc=1)

RT <- ¬(RA) + (RB) + 1

The sum ¬(RA) + (RB) + 1 is placed into register RT.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)

Extended Mnemonics:

Example of extended mnemonics for Subtract From:

Extended:          Equivalent to:
sub Rx,Ry,Rz       subf Rx,Rz,Ry
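
The operand order of Subtract From, and the reversal performed by the sub extended mnemonic, can be checked with a small Python model. The function names mirror the mnemonics but are only an illustrative sketch operating on unsigned 64-bit register values.

```python
MASK64 = (1 << 64) - 1


def subf(ra, rb):
    """RT <- not(RA) + (RB) + 1, that is (RB) - (RA) modulo 2**64."""
    return ((~ra & MASK64) + rb + 1) & MASK64


def sub(ry, rz):
    """sub Rx,Ry,Rz is coded as subf Rx,Rz,Ry, giving (Ry) - (Rz)."""
    return subf(rz, ry)
```

Adding the one's complement plus 1 is ordinary two's-complement subtraction, so subf(3, 10) yields 7 while sub(10, 3) yields the same value with the operands written in the more familiar order.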

Add Immediate Carrying D-form

addic RT,RA,SI
[Power mnemonic: ai]

RT <- (RA) + EXTS(SI)

The sum (RA) + SI is placed into register RT.

Special Registers Altered

CA

Extended Mnemonics:

Example of extended mnemonics for Add Immediate Carrying:

Extended:              Equivalent to:
subic Rx,Ry,value      addic Rx,Ry,-value

Add Immediate Carrying and Record D-form

addic. RT,RA,SI
[Power mnemonic: ai.]

| 13 | RT | RA | SI |

RT <- (RA) + EXTS(SI)

The sum (RA) + SI is placed into register RT.

Special Registers Altered

CR0 CA

Extended Mnemonics:

Example of extended mnemonics for Add Immediate Carrying and
Record:

Extended:              Equivalent to:
subic. Rx,Ry,value     addic. Rx,Ry,-value

Programming Note

The setting of CA by the Add and Subtract From instructions, including
the Extended versions thereof, is mode-dependent. If a sequence of these
instructions is used to perform extended-precision addition or subtraction,
the same mode should be used throughout the sequence.
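
Extended-precision addition chains a carrying add on the low-order doublewords with an extended add on the high-order doublewords. The Python sketch below assumes 64-bit mode throughout and models only the result and CA; the function `add128` is a hypothetical helper, not an instruction.

```python
MASK64 = (1 << 64) - 1


def addc(a, b):
    """Add Carrying: return (result, CA) in 64-bit mode."""
    total = a + b
    return total & MASK64, total >> 64


def adde(a, b, ca):
    """Add Extended: return (result, CA) for (RA) + (RB) + CA in 64-bit mode."""
    total = a + b + ca
    return total & MASK64, total >> 64


def add128(a_hi, a_lo, b_hi, b_lo):
    """Add two 128-bit values held as (high, low) doubleword pairs."""
    lo, ca = addc(a_lo, b_lo)      # addc  lo,a_lo,b_lo
    hi, ca = adde(a_hi, b_hi, ca)  # adde  hi,a_hi,b_hi
    return hi, lo, ca
```

If the two steps ran in different modes, the CA produced by the first would describe a different bit position than the one the second expects to consume.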



Subtract From Immediate Carrying D-form

subfic RT,RA,SI
[Power mnemonic: sfi]

RT <- ¬(RA) + EXTS(SI) + 1

The sum ¬(RA) + SI + 1 is placed into register RT.

Special Registers Altered

CA

Add Carrying XO-form

addc    RT,RA,RB    (OE=0 Rc=0)
addc.   RT,RA,RB    (OE=0 Rc=1)
addco   RT,RA,RB    (OE=1 Rc=0)
addco.  RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: a, a., ao, ao.]

RT <- (RA) + (RB)

The sum (RA) + (RB) is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)
Subtract From Carrying XO-form



subfc    RT,RA,RB    (OE=0 Rc=0)
subfc.   RT,RA,RB    (OE=0 Rc=1)
subfco   RT,RA,RB    (OE=1 Rc=0)
subfco.  RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: sf, sf., sfo, sfo.]

RT <- ¬(RA) + (RB) + 1

The sum ¬(RA) + (RB) + 1 is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)

Extended Mnemonics:

Example of extended mnemonics for Subtract From Carrying:

Extended:           Equivalent to:
subc Rx,Ry,Rz       subfc Rx,Rz,Ry



Add Extended XO-form

adde    RT,RA,RB    (OE=0 Rc=0)
adde.   RT,RA,RB    (OE=0 Rc=1)
addeo   RT,RA,RB    (OE=1 Rc=0)
addeo.  RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: ae, ae., aeo, aeo.]

RT <- (RA) + (RB) + CA

The sum (RA) + (RB) + CA is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)
Subtract From Extended XO-form



subfe    RT,RA,RB    (OE=0 Rc=0)
subfe.   RT,RA,RB    (OE=0 Rc=1)
subfeo   RT,RA,RB    (OE=1 Rc=0)
subfeo.  RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: sfe, sfe., sfeo, sfeo.]

RT <- ¬(RA) + (RB) + CA

The sum ¬(RA) + (RB) + CA is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)

Add to Minus One Extended XO-form

addme    RT,RA    (OE=0 Rc=0)
addme.   RT,RA    (OE=0 Rc=1)
addmeo   RT,RA    (OE=1 Rc=0)
addmeo.  RT,RA    (OE=1 Rc=1)

[Power mnemonics: ame, ame., ameo, ameo.]

RT <- (RA) + CA - 1

The sum (RA) + CA + ⁶⁴1 is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)



Book I PowerPC User Instruction Set Architecture 



88 



Chapter 3 Fixed-Point Processor 



Subtract From Minus One Extended XO-form

subfme    RT,RA    (OE=0 Rc=0)
subfme.   RT,RA    (OE=0 Rc=1)
subfmeo   RT,RA    (OE=1 Rc=0)
subfmeo.  RT,RA    (OE=1 Rc=1)

[Power mnemonics: sfme, sfme., sfmeo, sfmeo.]

RT <- ¬(RA) + CA - 1

The sum ¬(RA) + CA + ⁶⁴1 is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)

Add to Zero Extended XO-form

addze    RT,RA    (OE=0 Rc=0)
addze.   RT,RA    (OE=0 Rc=1)
addzeo   RT,RA    (OE=1 Rc=0)
addzeo.  RT,RA    (OE=1 Rc=1)

[Power mnemonics: aze, aze., azeo, azeo.]

RT <- (RA) + CA

The sum (RA) + CA is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)
Subtract From Zero Extended XO-form



subfze    RT,RA    (OE=0 Rc=0)
subfze.   RT,RA    (OE=0 Rc=1)
subfzeo   RT,RA    (OE=1 Rc=0)
subfzeo.  RT,RA    (OE=1 Rc=1)

[Power mnemonics: sfze, sfze., sfzeo, sfzeo.]

RT <- ¬(RA) + CA

The sum ¬(RA) + CA is placed into register RT.

Special Registers Altered

CA
CR0      (if Rc=1)
SO OV    (if OE=1)

Negate XO-form

neg    RT,RA    (OE=0 Rc=0)
neg.   RT,RA    (OE=0 Rc=1)
nego   RT,RA    (OE=1 Rc=0)
nego.  RT,RA    (OE=1 Rc=1)

RT <- ¬(RA) + 1

The sum ¬(RA) + 1 is placed into register RT.

If executing in 64-bit mode and register RA contains the most negative
64-bit number (0x8000_0000_0000_0000), the result is the most negative
number and, if OE=1, OV is set to 1. Similarly, if executing in 32-bit
mode and (RA)32:63 contains the most negative 32-bit number
(0x8000_0000), the low-order 32 bits of the result contain the most
negative 32-bit number and, if OE=1, OV is set to 1.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
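
The overflow case of Negate can be illustrated with a short Python model. The function name `nego` and its return convention (result, OV) are assumptions for this sketch; the operand is an unsigned 64-bit register value.

```python
MASK64 = (1 << 64) - 1


def nego(ra, mode64=True):
    """Return (result, OV) for RT <- not(RA) + 1 with OE=1."""
    result = ((~ra & MASK64) + 1) & MASK64
    width = 64 if mode64 else 32
    low = ra & ((1 << width) - 1)
    # Only the most negative number of the current width has no positive counterpart.
    ov = int(low == (1 << (width - 1)))
    return result, ov
```

Negating 0x8000_0000 overflows only in 32-bit mode; in 64-bit mode the same register value is an ordinary positive number.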
Programming Note

For mulli and mullw, the low-order 32 bits of the product are the correct
32-bit product for 32-bit mode.

For mulli and mulld, the low-order 64 bits of the product are independent
of whether the operands are regarded as signed or unsigned 64-bit integers.
For mulli and mullw, the low-order 32 bits of the product are independent
of whether the operands are regarded as signed or unsigned 32-bit integers.
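
The independence of the low-order product bits from the signedness of the operands follows from arithmetic modulo a power of two, and can be spot-checked in Python. The helper `low_product` is a hypothetical name used only for this sketch.

```python
def to_signed(value, bits):
    """Interpret the low-order `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def low_product(a, b, bits, signed):
    """Low-order `bits` bits of a*b, reading the operands as signed or unsigned."""
    mask = (1 << bits) - 1
    if signed:
        a, b = to_signed(a, bits), to_signed(b, bits)
    else:
        a, b = a & mask, b & mask
    return (a * b) & mask
```

Only the high-order half of the product differs between the two interpretations, which is why separate signed and unsigned forms exist for the Multiply High instructions but not for Multiply Low.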



Multiply Low Immediate D-form

mulli RT,RA,SI
[Power mnemonic: muli]

prod0:127 <- (RA) × EXTS(SI)
RT <- prod64:127

The 64-bit first operand is (RA). The 64-bit second operand is the 
sign-extended value of the SI field. The low-order 64 bits of the 128-bit 
product of the operands are placed into register RT. 

Both operands and the product are interpreted as signed integers. 

Special Registers Altered 

None 



Programming Note

The XO-form Multiply instructions may execute faster on some
implementations if RB contains the operand having the smaller absolute
value.



Multiply Low Doubleword XO-form

mulld    RT,RA,RB    (OE=0 Rc=0)
mulld.   RT,RA,RB    (OE=0 Rc=1)
mulldo   RT,RA,RB    (OE=1 Rc=0)
mulldo.  RT,RA,RB    (OE=1 Rc=1)

prod0:127 <- (RA) × (RB)
RT <- prod64:127

The 64-bit operands are (RA) and (RB). The low-order 64 bits of the
128-bit product of the operands are placed into register RT.

If OE=1 then OV is set to 1 if the product cannot be represented in 64
bits.

Both operands and the product are interpreted as signed integers.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
Multiply Low Word XO-form



mullw    RT,RA,RB    (OE=0 Rc=0)
mullw.   RT,RA,RB    (OE=0 Rc=1)
mullwo   RT,RA,RB    (OE=1 Rc=0)
mullwo.  RT,RA,RB    (OE=1 Rc=1)

[Power mnemonics: muls, muls., mulso, mulso.]

RT <- (RA)32:63 × (RB)32:63

The 32-bit operands are the low-order 32 bits of RA and of RB. The
64-bit product of the operands is placed into register RT.

If OE=1 then OV is set to 1 if the product cannot be represented in 32
bits.

Both operands and the product are interpreted as signed integers.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)

Multiply High Doubleword XO-form

mulhd    RT,RA,RB    (Rc=0)
mulhd.   RT,RA,RB    (Rc=1)

prod0:127 <- (RA) × (RB)
RT <- prod0:63

The 64-bit operands are (RA) and (RB). The high-order 64 bits of the
128-bit product of the operands are placed into register RT.

Both operands and the product are interpreted as signed integers.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0      (if Rc=1)
Multiply High Word XO-form



mulhw    RT,RA,RB    (Rc=0)
mulhw.   RT,RA,RB    (Rc=1)

prod0:63 <- (RA)32:63 × (RB)32:63
RT32:63 <- prod0:31
RT0:31 <- undefined

The 32-bit operands are the low-order 32 bits of RA and of RB. The
high-order 32 bits of the 64-bit product of the operands are placed into
RT32:63. (RT)0:31 are undefined.

Both operands and the product are interpreted as signed integers.

Special Registers Altered

CR0      (if Rc=1)

Multiply High Doubleword Unsigned XO-form

mulhdu    RT,RA,RB    (Rc=0)
mulhdu.   RT,RA,RB    (Rc=1)

prod0:127 <- (RA) × (RB)
RT <- prod0:63

The 64-bit operands are (RA) and (RB). The high-order 64 bits of the
128-bit product of the operands are placed into register RT.

Both operands and the product are interpreted as unsigned integers,
except that if Rc=1 the first three bits of CR Field 0 are set by signed
comparison of the result to zero.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0      (if Rc=1)
Multiply High Word Unsigned XO-form

mulhwu    RT,RA,RB    (Rc=0)
mulhwu.   RT,RA,RB    (Rc=1)

| 31 | RT | RA | RB | / | 11 | Rc |
  0    6    11   16   21  22   31

prod0:63 <- (RA)32:63 × (RB)32:63
RT32:63 <- prod0:31
RT0:31 <- undefined

The 32-bit operands are the low-order 32 bits of RA and of RB. The
high-order 32 bits of the 64-bit product of the operands are placed into
RT32:63. (RT)0:31 are undefined.

Both operands and the product are interpreted as unsigned integers,
except that if Rc=1 the first three bits of CR Field 0 are set by signed
comparison of the result to zero.

Special Registers Altered

CR0      (if Rc=1)
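
The difference between the signed and unsigned Multiply High Word forms shows up only in how the 32-bit operands are read before multiplying. The following Python sketch models just the defined low-order 32 bits of RT; the high-order 32 bits are undefined by the architecture and are not modeled.

```python
MASK32 = (1 << 32) - 1


def to_signed32(value):
    """Interpret the low-order 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value >> 31 else value


def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product of the low words."""
    return ((to_signed32(ra) * to_signed32(rb)) >> 32) & MASK32


def mulhwu(ra, rb):
    """High-order 32 bits of the unsigned 64-bit product of the low words."""
    return ((ra & MASK32) * (rb & MASK32)) >> 32
```

For the operands 0xFFFF_FFFF and 2, the signed form sees -1 × 2 and returns all ones, while the unsigned form returns 1.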
Programming Note

The 64-bit signed remainder of dividing (RA) by (RB) can be computed as
follows, except in the case that (RA) = -2^63 and (RB) = -1.

divd   RT,RA,RB    # RT = quotient
mulld  RT,RT,RB    # RT = quotient*divisor
subf   RT,RT,RA    # RT = remainder

Divide Doubleword XO-form

divd    RT,RA,RB    (OE=0 Rc=0)
divd.   RT,RA,RB    (OE=0 Rc=1)
divdo   RT,RA,RB    (OE=1 Rc=0)
divdo.  RT,RA,RB    (OE=1 Rc=1)

| 31 | RT | RA | RB | OE | 489 | Rc |
  0    6    11   16   21   22    31

dividend0:63 <- (RA)
divisor0:63 <- (RB)
RT <- dividend ÷ divisor
The 64-bit dividend is (RA). The 64-bit divisor is (RB). The 64-bit
quotient of the dividend and divisor is placed into RT. The remainder is
not supplied as a result.

Both operands and the quotient are interpreted as signed integers. The
quotient is the unique signed integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < |divisor| if the dividend is nonnegative, and -|divisor| < r ≤ 0
if the dividend is negative.

If an attempt is made to perform any of the divisions

0x8000_0000_0000_0000 ÷ -1
<anything> ÷ 0

then the contents of register RT are undefined as are (if Rc=1) the
contents of the LT, GT, and EQ bits of CR Field 0. In these cases, if OE=1
then OV is set to 1.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
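
The quotient rule above amounts to division that truncates toward zero, so the remainder takes the sign of the dividend. A Python sketch follows; Python's own `//` rounds toward negative infinity, so the model divides magnitudes instead. Returning `None` for the undefined cases is a convention of this example only.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def divd(ra, rb):
    """Signed 64-bit quotient, or None where the architecture leaves RT undefined."""
    dividend, divisor = to_signed64(ra), to_signed64(rb)
    if divisor == 0 or (dividend == -(1 << 63) and divisor == -1):
        return None
    quotient = abs(dividend) // abs(divisor)   # truncate toward zero
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient & MASK64


def remainder(ra, rb):
    """Follow the divd / mulld / subf sequence from the Programming Note."""
    rt = divd(ra, rb)
    if rt is None:
        return None
    rt = (rt * rb) & MASK64      # mulld RT,RT,RB
    return (ra - rt) & MASK64    # subf  RT,RT,RA
```

Dividing -7 by 2 gives a quotient of -3 and a remainder of -1, consistent with the constraint on r for a negative dividend.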
Divide Word XO-form



divw    RT,RA,RB    (OE=0 Rc=0)
divw.   RT,RA,RB    (OE=0 Rc=1)
divwo   RT,RA,RB    (OE=1 Rc=0)
divwo.  RT,RA,RB    (OE=1 Rc=1)

| 31 | RT | RA | RB | OE | 491 | Rc |
  0    6    11   16   21   22    31

dividend0:63 <- EXTS((RA)32:63)
divisor0:63 <- EXTS((RB)32:63)
RT32:63 <- dividend ÷ divisor
RT0:31 <- undefined

The 64-bit dividend is the sign-extended value of (RA)32:63. The 64-bit
divisor is the sign-extended value of (RB)32:63. The 64-bit quotient is
formed. The low-order 32 bits of the 64-bit quotient are placed into
RT32:63. (RT)0:31 are undefined. The remainder is not supplied as a
result.

Both operands and the quotient are interpreted as signed integers. The
quotient is the unique signed integer that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < |divisor| if the dividend is nonnegative, and -|divisor| < r ≤ 0
if the dividend is negative.

If an attempt is made to perform any of the divisions

0x8000_0000 ÷ -1
<anything> ÷ 0

then the contents of register RT are undefined as are (if Rc=1) the
contents of the LT, GT, and EQ bits of CR Field 0. In these cases, if OE=1
then OV is set to 1.

Programming Note

The 32-bit signed remainder of dividing (RA)32:63 by (RB)32:63 can
be computed as follows, except in the case that (RA) = -2^31 and (RB) = -1.

divw   RT,RA,RB    # RT = quotient
mullw  RT,RT,RB    # RT = quotient*divisor
subf   RT,RT,RA    # RT = remainder

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
Programming Note

The 64-bit unsigned remainder of dividing (RA) by (RB) can be
computed as follows.

divdu  RT,RA,RB    # RT = quotient
mulld  RT,RT,RB    # RT = quotient*divisor
subf   RT,RT,RA    # RT = remainder

Divide Doubleword Unsigned XO-form

divdu    RT,RA,RB    (OE=0 Rc=0)
divdu.   RT,RA,RB    (OE=0 Rc=1)
divduo   RT,RA,RB    (OE=1 Rc=0)
divduo.  RT,RA,RB    (OE=1 Rc=1)

| 31 | RT | RA | RB | OE | 457 | Rc |
  0    6    11   16   21   22    31

dividend0:63 <- (RA)
divisor0:63 <- (RB)
RT <- dividend ÷ divisor

The 64-bit dividend is (RA). The 64-bit divisor is (RB). The 64-bit
quotient of the dividend and divisor is placed into RT. The remainder is
not supplied as a result.

Both operands and the quotient are interpreted as unsigned integers,
except that if Rc=1 the first three bits of CR Field 0 are set by signed
comparison of the result to zero. The quotient is the unique unsigned integer
that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < divisor.

If an attempt is made to perform the division

<anything> ÷ 0

then the contents of register RT are undefined as are (if Rc=1) the
contents of the LT, GT, and EQ bits of CR Field 0. In this case, if OE=1 then
OV is set to 1.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
Divide Word Unsigned XO-form



divwu    RT,RA,RB    (OE=0 Rc=0)
divwu.   RT,RA,RB    (OE=0 Rc=1)
divwuo   RT,RA,RB    (OE=1 Rc=0)
divwuo.  RT,RA,RB    (OE=1 Rc=1)

| 31 | RT | RA | RB | OE | 459 | Rc |
  0    6    11   16   21   22    31

dividend0:63 <- ³²0 || (RA)32:63
divisor0:63 <- ³²0 || (RB)32:63
RT32:63 <- dividend ÷ divisor
RT0:31 <- undefined

The 64-bit dividend is the zero-extended value of (RA)32:63. The 64-bit
divisor is the zero-extended value of (RB)32:63. The 64-bit quotient is
formed. The low-order 32 bits of the 64-bit quotient are placed into
RT32:63. (RT)0:31 are undefined. The remainder is not supplied as a
result.

Both operands and the quotient are interpreted as unsigned integers,
except that if Rc=1 the first three bits of CR Field 0 are set by signed
comparison of the result to zero. The quotient is the unique unsigned integer
that satisfies

dividend = (quotient × divisor) + r

where 0 ≤ r < divisor.

If an attempt is made to perform the division

<anything> ÷ 0

then the contents of register RT are undefined as are (if Rc=1) the
contents of the LT, GT, and EQ bits of CR Field 0. In this case, if OE=1 then
OV is set to 1.

Programming Note

The 32-bit unsigned remainder of dividing (RA)32:63 by (RB)32:63 can
be computed as follows.

divwu  RT,RA,RB    # RT = quotient
mullw  RT,RT,RB    # RT = quotient*divisor
subf   RT,RT,RA    # RT = remainder

Special Registers Altered

CR0      (if Rc=1)
SO OV    (if OE=1)
3.3.10 Fixed-Point Compare Instructions

The fixed-point Compare instructions compare the contents of register
RA with (1) the sign-extended value of the SI field, (2) the zero-extended
value of the UI field, or (3) the contents of register RB. The comparison is
signed for cmpi and cmp, and unsigned for cmpli and cmpl.

For 64-bit implementations, the L field controls whether the operands
are treated as 64- or 32-bit quantities, as follows:

L    Operand length
0    32-bit operands
1    64-bit operands

When the operands are treated as 32-bit signed quantities, bit 32 of
the register (RA or RB) is the sign bit.

For 32-bit implementations, the L field must be zero.

The Compare instructions set one bit in the leftmost three bits of the
designated CR field to 1, and the other two to 0. XERSO is copied into
bit 3 of the designated CR field.

The CR field is set as follows.

Bit  Name  Description
0    LT    (RA) < SI or (RB) (signed comparison)
           (RA) <ᵘ UI or (RB) (unsigned comparison)
1    GT    (RA) > SI or (RB) (signed comparison)
           (RA) >ᵘ UI or (RB) (unsigned comparison)
2    EQ    (RA) = SI, UI, or (RB)
3    SO    Summary Overflow from the XER
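
The way a compare fills the designated CR field can be sketched in Python. The function `compare` and its parameters are hypothetical; the second operand is assumed to be already extended (sign-extended SI, zero-extended UI, or the contents of RB), and the 4-bit field is returned with LT as the most significant bit.

```python
def to_signed(value, bits):
    """Interpret the low-order `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def compare(ra, operand, signed, l_field, so=0):
    """Return the 4-bit CR field value LT|GT|EQ|SO."""
    width = 64 if l_field else 32          # L=0 selects 32-bit operands
    mask = (1 << width) - 1
    a, b = ra & mask, operand & mask
    if signed:
        a, b = to_signed(a, width), to_signed(b, width)
    if a < b:
        c = 0b100
    elif a > b:
        c = 0b010
    else:
        c = 0b001
    return (c << 1) | so                   # bit 3 is a copy of XER SO
```

The value 0xFFFF_FFFF compares less than 1 as a signed 32-bit operand, but greater than 1 when compared logically or as a 64-bit operand.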



Extended mnemonics for compares 

A set of extended mnemonics is provided so that compares can be coded 
with the operand length as part of the instruction mnemonic rather than 
as a numeric operand. Some of these are shown as examples with the 
Compare instructions. The extended mnemonics for double word com- 
parisons are available only in 64-bit implementations. See Appendix C, 
"Assembler Extended Mnemonics," on page 215 for additional extended 
mnemonics. 
Compare Immediate D-form

cmpi BF,L,RA,SI

| 11 | BF | / | L  | RA | SI |
  0    6    9   10   11   16   31

if L = 0 then a <- EXTS((RA)32:63)
         else a <- (RA)
if      a < EXTS(SI) then c <- 0b100
else if a > EXTS(SI) then c <- 0b010
else                      c <- 0b001
CR4×BF:4×BF+3 <- c || XERSO

The contents of register RA ((RA)32:63 sign-extended to 64 bits if L=0)
are compared with the sign-extended value of the SI field, treating the
operands as signed integers. The result of the comparison is placed into
CR field BF.

In 32-bit implementations, if L=1 the instruction form is invalid.

Special Registers Altered

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare Immediate:

Extended:               Equivalent to:
cmpdi Rx,value          cmpi 0,1,Rx,value
cmpwi cr3,Rx,value      cmpi 3,0,Rx,value



Compare X-form

cmp BF,L,RA,RB

| 31 | BF | / | L | RA | RB | 0 | / |

if L = 0 then a <- EXTS((RA)32:63)
              b <- EXTS((RB)32:63)
         else a <- (RA)
              b <- (RB)
if      a < b then c <- 0b100
else if a > b then c <- 0b010
else               c <- 0b001
CR4×BF:4×BF+3 <- c || XERSO

The contents of register RA ((RA)32:63 if L=0) are compared with the
contents of register RB ((RB)32:63 if L=0), treating the operands as signed
integers. The result of the comparison is placed into CR field BF.

In 32-bit implementations, if L=1 the instruction form is invalid.

Special Registers Altered

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare:

Extended:             Equivalent to:
cmpd Rx,Ry            cmp 0,1,Rx,Ry
cmpw cr3,Rx,Ry        cmp 3,0,Rx,Ry



Compare Logical Immediate D-form

cmpli BF,L,RA,UI

| 10 | BF | / | L  | RA | UI |
  0    6    9   10   11   16   31

if L = 0 then a <- ³²0 || (RA)32:63
         else a <- (RA)
if      a <ᵘ (⁴⁸0 || UI) then c <- 0b100
else if a >ᵘ (⁴⁸0 || UI) then c <- 0b010
else                         c <- 0b001
CR4×BF:4×BF+3 <- c || XERSO

The contents of register RA ((RA)32:63 zero-extended to 64 bits if
L=0) are compared with ⁴⁸0 || UI, treating the operands as unsigned
integers. The result of the comparison is placed into CR field BF.

In 32-bit implementations, if L=1 the instruction form is invalid.

Special Registers Altered

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare Logical Immediate:

Extended:                Equivalent to:
cmpldi Rx,value          cmpli 0,1,Rx,value
cmplwi cr3,Rx,value      cmpli 3,0,Rx,value
Compare Logical X-form

cmpl BF,L,RA,RB

| 31 | BF | / | L | RA | RB | 32 | / |

if L = 0 then a <- ³²0 || (RA)32:63
              b <- ³²0 || (RB)32:63
         else a <- (RA)
              b <- (RB)
if      a <ᵘ b then c <- 0b100
else if a >ᵘ b then c <- 0b010
else                c <- 0b001
CR4×BF:4×BF+3 <- c || XERSO

The contents of register RA ((RA)32:63 if L=0) are compared with the
contents of register RB ((RB)32:63 if L=0), treating the operands as
unsigned integers. The result of the comparison is placed into CR field
BF.

In 32-bit implementations, if L=1 the instruction form is invalid.

Special Registers Altered

CR field BF

Extended Mnemonics:

Examples of extended mnemonics for Compare Logical:

Extended:              Equivalent to:
cmpld Rx,Ry            cmpl 0,1,Rx,Ry
cmplw cr3,Rx,Ry        cmpl 3,0,Rx,Ry

3.3.11 Fixed-Point Trap Instructions

The Trap instructions are provided to test for a specified set of condi- 
tions. If any of the conditions tested by a Trap instruction are met, the 
system trap handler is invoked. If none of the tested conditions are met, 
instruction execution continues normally. 

The contents of register RA are compared with either the sign-extended
value of the SI field or the contents of register RB, depending on
the Trap instruction. For tdi and td, the entire contents of RA (and RB)
participate in the comparison; for twi and tw, only the contents of the
low-order 32 bits of RA (and RB) participate in the comparison.

This comparison results in five conditions which are ANDed with TO.
If the result is not 0 the system trap handler is invoked. These conditions
are:

TO bit   ANDed with Condition
0        Less Than, using signed comparison
1        Greater Than, using signed comparison
2        Equal
3        Less Than, using unsigned comparison
4        Greater Than, using unsigned comparison
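
The TO test can be modeled in a few lines of Python. TO bit 0 is the most significant bit of the 5-bit field, so it carries the weight 16 when TO is written as a number. The function `trap_taken` is a hypothetical name; `bits` selects the doubleword (64) or word (32) forms, and the second operand is assumed to be already sign-extended for the immediate forms.

```python
def to_signed(value, bits):
    """Interpret the low-order `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def trap_taken(to, a, b, bits=64):
    """Return True if any condition selected by the 5-bit TO field is met."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask
    sa, sb = to_signed(a, bits), to_signed(b, bits)
    conditions = (
        ((sa < sb) << 4)     # TO bit 0: less than, signed
        | ((sa > sb) << 3)   # TO bit 1: greater than, signed
        | ((ua == ub) << 2)  # TO bit 2: equal
        | ((ua < ub) << 1)   # TO bit 3: less than, unsigned
        | (ua > ub)          # TO bit 4: greater than, unsigned
    )
    return bool(to & conditions)
```

A TO value of 24 selects both signed inequalities and therefore traps on "not equal", while 31 selects every condition and always traps.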

Extended mnemonics for traps 

A set of extended mnemonics is provided so that traps can be coded with
the condition as part of the instruction mnemonic rather than as a
numeric operand. Some of these are shown as examples with the Trap
instructions. See Appendix C, "Assembler Extended Mnemonics," on
page 215 for additional extended mnemonics.



Trap Doubleword Immediate D-form

tdi TO,RA,SI

| 2 | TO | RA | SI |

a <- (RA)
if (a <  EXTS(SI)) & TO0 then TRAP
if (a >  EXTS(SI)) & TO1 then TRAP
if (a =  EXTS(SI)) & TO2 then TRAP
if (a <ᵘ EXTS(SI)) & TO3 then TRAP
if (a >ᵘ EXTS(SI)) & TO4 then TRAP

The contents of register RA are compared with the sign-extended
value of the SI field. If any bit in the TO field is set to 1 and its
corresponding condition is met by the result of the comparison, then the
system trap handler is invoked.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

None

Extended Mnemonics:

Examples of extended mnemonics for Trap Doubleword Immediate:

Extended:            Equivalent to:
tdlti Rx,value       tdi 16,Rx,value
tdnei Rx,value       tdi 24,Rx,value



Trap Word Immediate D-form

twi TO,RA,SI
[Power mnemonic: ti]

| 3 | TO | RA | SI |

a <- EXTS((RA)32:63)
if (a <  EXTS(SI)) & TO0 then TRAP
if (a >  EXTS(SI)) & TO1 then TRAP
if (a =  EXTS(SI)) & TO2 then TRAP
if (a <ᵘ EXTS(SI)) & TO3 then TRAP
if (a >ᵘ EXTS(SI)) & TO4 then TRAP

The contents of RA32:63 are compared with the sign-extended value of
the SI field. If any bit in the TO field is set to 1 and its corresponding
condition is met by the result of the comparison, then the system trap
handler is invoked.

Special Registers Altered

None

Extended Mnemonics:

Examples of extended mnemonics for Trap Word Immediate:

Extended:             Equivalent to:
twgti  Rx,value       twi 8,Rx,value
twllei Rx,value       twi 6,Rx,value






Trap Doubleword X-form

td TO,RA,RB

| 31 | TO | RA | RB | 68 | / |

a <- (RA)
b <- (RB)
if (a <  b) & TO0 then TRAP
if (a >  b) & TO1 then TRAP
if (a =  b) & TO2 then TRAP
if (a <ᵘ b) & TO3 then TRAP
if (a >ᵘ b) & TO4 then TRAP

The contents of register RA are compared with the contents of register
RB. If any bit in the TO field is set to 1 and its corresponding condition is
met by the result of the comparison, then the system trap handler is
invoked.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

None

Extended Mnemonics:

Examples of extended mnemonics for Trap Doubleword:

Extended:          Equivalent to:
tdge  Rx,Ry        td 12,Rx,Ry
tdlnl Rx,Ry        td 5,Rx,Ry
Trap Word X-form

tw TO,RA,RB 
[Power mnemonic: t] 



31 


TO 


RA 


RB 


4 
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31 



a EXTS((RA)32.63 ) 

b 4- EXTS((RB)32:63) 

if (a < b) & TOo then TRAP 

1f (a > b) & TOi then TRAP 

if (a = b) & TO2 then TRAP 

if (a ^ b) & TO3 then TRAP 

if id :^ b) ^ TO4 then TRAP 

The contents of RA32:63 are compared with the contents of RB32;63- If 
any bit in the TO field is set to 1 and its corresponding condition is met 
by the result of the comparison, then the system trap handler is invoked. 

Special Registers Altered 

None 



Extended Mnemonics: 

Examples of extended mnemonics for Trap Word: 

Extended: Equivalent to: 

tweq Rx,Ry tw 4,Rx,Ry 

twlge Rx,Ry tw 5,Rx,Ry 

trap tw 31,0,0 
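The TO-field test used by the Trap instructions can be sketched in Python. This is only an illustrative model under stated assumptions: the function names (exts32, signed64, trap_taken, tw, td) are hypothetical, registers are modeled as plain integers, and TO is passed as a 5-bit number whose most significant bit is TO0.

```python
MASK64 = (1 << 64) - 1

def exts32(value):
    """Sign-extend bits 32:63 (the low-order 32 bits) of a register value."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low

def signed64(value):
    """Interpret a 64-bit register value as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def trap_taken(to, a, b):
    """True if any TO bit is 1 and its condition holds for signed a and b."""
    ua, ub = a & MASK64, b & MASK64       # unsigned views for the logical tests
    conditions = (
        (0b10000, a < b),     # TO0: less than
        (0b01000, a > b),     # TO1: greater than
        (0b00100, a == b),    # TO2: equal
        (0b00010, ua < ub),   # TO3: logically less than
        (0b00001, ua > ub),   # TO4: logically greater than
    )
    return any(bool(to & bit) and holds for bit, holds in conditions)

def tw(to, ra, rb):
    """Trap Word: compares the sign-extended low-order 32 bits."""
    return trap_taken(to, exts32(ra), exts32(rb))

def td(to, ra, rb):
    """Trap Doubleword: compares the full 64-bit values."""
    return trap_taken(to, signed64(ra), signed64(rb))
```

With this model, tw(4, x, y) behaves like the tweq extended mnemonic and tw(31, 0, 0) always traps, because the equal condition holds.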



3.3.12 Fixed-Point Logical Instructions 

The Logical instructions perform bit-parallel operations on 64-bit oper- 
ands. 

The X-form Logical instructions with Rc=1, and the D-form Logical
instructions andi. and andis., set the first three bits of CR Field 0 as
described in Section 3.3.8, "Other Fixed-Point Instructions," on page 80.
The Logical instructions do not change the SO, OV, and CA bits in the
XER.
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Extended mnemonics for logical operations 

An extended mnemonic is provided that generates the preferred form of 
"no-op" (an instruction that does nothing). This is shown as an example 
with the OR Immediate instruction. 

Extended mnemonics are provided that use the OR and NOR instructions
to copy the contents of one register to another, with and without
complementing. These are shown as examples with the two instructions.

See Appendix C, "Assembler Extended Mnemonics," on page 215 for 
additional extended mnemonics. 



AND Immediate D-form 

andi. RA,RS,UI 
[Power mnemonic: andil.] 





28 


RS 


RA 




UI 
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RA <- (RS) & ("^^0 II UI) 

The contents of register RS are ANDed with ⁴⁸0 || UI and the result is
placed into register RA.

Special Registers Altered

CR0



AND Immediate Shifted D-form 

andis. RA,RS,UI 
[Power mnemonic: andiu.] 



29 


RS 


RA 




UI 




0 
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11 


16 




31 



RA <- (RS) & (³²0 || UI || ¹⁶0)

The contents of register RS are ANDed with ³²0 || UI || ¹⁶0 and the
result is placed into register RA.

Special Registers Altered

CR0
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OR Immediate D-form 

ori RA,RS,UI 
[Power mnemonic: oril] 



24 
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RA 




UI 
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31 



RA <- (RS) | (⁴⁸0 || UI)

The contents of register RS are ORed with ⁴⁸0 || UI and the result is
placed into register RA.

The preferred "no-op" (an instruction that does nothing) is:

ori 0,0,0

Special Registers Altered 

None 

Extended Mnemonics: 

Example of extended mnemonics for OR Immediate: 

Extended: Equivalent to: 

nop ori 0,0,0 



OR Immediate Shifted D-form

oris RA,RS,UI 
[Power mnemonic: oriu] 



25 


RS 


RA 




UI 
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16 




31 



RA <- (RS) | (³²0 || UI || ¹⁶0)

The contents of register RS are ORed with ³²0 || UI || ¹⁶0 and the result
is placed into register RA.



Special Registers Altered 

None 
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XOR Immediate D-form 

xori RA,RS,UI 
[Power mnemonic: xoril] 





26 


RS 


RA 




UI 
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RA <- (RS) e ("^^0 II UI) 

The contents of register RS are XORed with ⁴⁸0 || UI and the result is
placed into register RA.

Special Registers Altered 

None 

XOR Immediate Shifted D-form 

xoris RA,RS,UI 
[Power mnemonic: xoriu] 



27 
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RA 




UI 
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31 



RA <- (RS) ⊕ (³²0 || UI || ¹⁶0)

The contents of register RS are XORed with ³²0 || UI || ¹⁶0 and the
result is placed into register RA.

Special Registers Altered 

None 
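The zero-extension of UI in the six immediate Logical instructions can be illustrated with a small Python model. This is a sketch under assumptions: registers are modeled as 64-bit integers, the function names simply mirror the mnemonics, and the CR0 update performed by andi. and andis. is not modeled.

```python
MASK64 = (1 << 64) - 1

def low_operand(ui):
    """48 zero bits concatenated with the 16-bit UI field."""
    return ui & 0xFFFF

def shifted_operand(ui):
    """32 zero bits, then UI, then 16 zero bits."""
    return (ui & 0xFFFF) << 16

def andi(rs, ui):
    return rs & low_operand(ui)

def andis(rs, ui):
    return rs & shifted_operand(ui)

def ori(rs, ui):
    return (rs | low_operand(ui)) & MASK64

def oris(rs, ui):
    return (rs | shifted_operand(ui)) & MASK64

def xori(rs, ui):
    return (rs ^ low_operand(ui)) & MASK64

def xoris(rs, ui):
    return (rs ^ shifted_operand(ui)) & MASK64
```

For example, ori(oris(0, 0x1234), 0x5678) yields 0x12345678, and ori(r, 0) leaves r unchanged, which is why ori 0,0,0 can serve as the preferred no-op.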
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AND X-form 



and   RA,RS,RB   (Rc=0)
and.  RA,RS,RB   (Rc=1)



31 


RS 


RA 


RB 
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Rc 


0 
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Rk ^ (RS) & (RB) 

The contents of register RS are ANDed with the contents of register
RB and the result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



OR X-form 



or   RA,RS,RB   (Rc=0)
or.  RA,RS,RB   (Rc=1)
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RA 


RB 
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RA <- (RS) I (RB) 

The contents of register RS are ORed with the contents of register RB 
and the result is placed into register RA. 



Special Registers Altered

CR0 (if Rc=1)

Extended Mnemonics:

Example of extended mnemonics for OR:

Extended: Equivalent to:

mr Rx,Ry or Rx,Ry,Ry
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XOR X-form 



xor   RA,RS,RB   (Rc=0)
xor.  RA,RS,RB   (Rc=1)



31 


RS 


RA 


RB 


316 


Rc 
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31 



RA <- (RS) ⊕ (RB)

The contents of register RS are XORed with the contents of register
RB and the result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



NAND X-form

nand   RA,RS,RB   (Rc=0)
nand.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 476 | Rc |
  0    6    11   16   21    31

RA <- ¬((RS) & (RB))

The contents of register RS are ANDed with the contents of register
RB and the complemented result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



NOR X-form 



nor   RA,RS,RB   (Rc=0)
nor.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 124 | Rc |
  0    6    11   16   21    31

RA <- ¬((RS) | (RB))

The contents of register RS are ORed with the contents of register RB
and the complemented result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



Programming Note

nand or nor with RS=RB can be used to obtain the one's complement.
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Extended Mnemonics: 

Example of extended mnemonics for NOR: 

Extended: Equivalent to: 

not Rx,Ry nor Rx,Ry,Ry 



Equivalent X-form 



eqv   RA,RS,RB   (Rc=0)
eqv.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 284 | Rc |
  0    6    11   16   21    31

RA <- (RS) ≡ (RB)

The contents of register RS are XORed with the contents of register
RB and the complemented result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



AND with Complement X-form

andc   RA,RS,RB   (Rc=0)
andc.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 60 | Rc |
  0    6    11   16   21   31

RA <- (RS) & ¬(RB)

The contents of register RS are ANDed with the complement of the
contents of register RB and the result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)
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OR with Complement X-f orm 



orc   RA,RS,RB   (Rc=0)
orc.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 412 | Rc |
  0    6    11   16   21    31

RA <- (RS) | ¬(RB)

The contents of register RS are ORed with the complement of the
contents of register RB and the result is placed into register RA.

Special Registers Altered

CR0 (if Rc=1)



Extend Sign Byte X-form 



extsb   RA,RS   (Rc=0)
extsb.  RA,RS   (Rc=1)

| 31 | RS | RA | /// | 954 | Rc |
  0    6    11   16    21    31

s <- (RS)56
RA56:63 <- (RS)56:63
RA0:55 <- ⁵⁶s

(RS)56:63 are placed into RA56:63. Bit 56 of register RS is placed into
RA0:55.

Special Registers Altered

CR0 (if Rc=1)






Extend Sign Halfword X-form 



extsh   RA,RS   (Rc=0)
extsh.  RA,RS   (Rc=1)

[Power mnemonics: exts, exts.]

| 31 | RS | RA | /// | 922 | Rc |
  0    6    11   16    21    31

s <- (RS)48
RA48:63 <- (RS)48:63
RA0:47 <- ⁴⁸s

(RS)48:63 are placed into RA48:63. Bit 48 of register RS is placed into
RA0:47.

Special Registers Altered

CR0 (if Rc=1)

Extend Sign Word X-form 



extsw   RA,RS   (Rc=0)
extsw.  RA,RS   (Rc=1)

| 31 | RS | RA | /// | 986 | Rc |
  0    6    11   16    21    31

s <- (RS)32
RA32:63 <- (RS)32:63
RA0:31 <- ³²s

(RS)32:63 are placed into RA32:63. Bit 32 of register RS is placed into
RA0:31.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CR0 (if Rc=1)
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Count Leading Zeros Doubleword X-form 



cntlzd   RA,RS   (Rc=0)
cntlzd.  RA,RS   (Rc=1)

| 31 | RS | RA | /// | 58 | Rc |
  0    6    11   16    21   31

n <- 0
do while n < 64
   if (RS)n = 1 then leave
   n <- n + 1
RA <- n

A count of the number of consecutive zero bits starting at bit 0 of
register RS is placed into RA. This number ranges from 0 to 64, inclusive.
If Rc=1, CR Field 0 is set to reflect the result.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Programming Note

For both Count Leading Zeros instructions, if Rc=1 then LT is set to 0
in CR Field 0.

Special Registers Altered

CR0 (if Rc=1)



Count Leading Zeros Word X-form 



cntlzw   RA,RS   (Rc=0)
cntlzw.  RA,RS   (Rc=1)

[Power mnemonics: cntlz, cntlz.]

| 31 | RS | RA | /// | 26 | Rc |
  0    6    11   16    21   31

n <- 32
do while n < 64
   if (RS)n = 1 then leave
   n <- n + 1
RA <- n - 32

A count of the number of consecutive zero bits starting at bit 32 of
register RS is placed into RA. This number ranges from 0 to 32, inclusive.

If Rc=1, CR Field 0 is set to reflect the result.

Special Registers Altered

CR0 (if Rc=1)
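The two Count Leading Zeros loops translate almost line for line into Python. The following is a minimal sketch, assuming a register is modeled as a 64-bit integer with bit 0 as the most significant bit; the helper name bit is hypothetical.

```python
def bit(rs, n):
    """Return bit n of a 64-bit value, where bit 0 is the most significant."""
    return (rs >> (63 - n)) & 1

def cntlzd(rs):
    """Count consecutive zero bits starting at bit 0 (result 0..64)."""
    n = 0
    while n < 64:
        if bit(rs, n) == 1:
            break
        n += 1
    return n

def cntlzw(rs):
    """Count consecutive zero bits starting at bit 32 (result 0..32)."""
    n = 32
    while n < 64:
        if bit(rs, n) == 1:
            break
        n += 1
    return n - 32
```

Because the result is never negative, the LT bit of CR Field 0 is always 0 when Rc=1, as the Programming Note states.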
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3.3.13 Fixed-Point Rotate and Shift 
instructions 

The Fixed-Point Processor performs rotation operations on data from a
GPR and returns the result, or a portion of the result, to a GPR.

The rotation operations rotate a 64-bit quantity left by a specified 
number of bit positions. Bits that exit from position 0 enter at 
position 63. 

Two types of rotation operation are supported. 

For the first type, denoted rotate64 or ROTL64, the value rotated is the
given 64-bit value. The rotate64 operation is used to rotate a given 64-bit
quantity.

For the second type, denoted rotate32 or ROTL32, the value rotated 
consists of two copies of bits 32:63 of the given 64-bit value, one copy in 
bits 0:31 and the other in bits 32:63. The rotate32 operation is used to 
rotate a given 32-bit quantity. 

The Rotate and Shift instructions employ a mask generator. The mask 
is 64 bits long, and consists of 1-bits from a start bit, mstart, through and 
including a stop bit, mstop, and 0-bits elsewhere. The values of mstart 
and mstop range from 0 to 63. If mstart > mstop, the 1-bits wrap around 
from position 63 to position 0. Thus the mask is formed as follows: 

if mstart ≤ mstop then
   mask[mstart:mstop] = ones
   mask[all other bits] = zeros
else
   mask[mstart:63] = ones
   mask[0:mstop] = ones
   mask[all other bits] = zeros

There is no way to specify an all-zero mask. 

For instructions that use the rotate32 operation, the mask start and 
stop positions are always in the low-order 32 bits of the mask. 
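The rotate operations and the mask generator described above can be modeled in a few lines of Python. This is a sketch, not part of the architecture definition: the names rotl64, rotl32, and mask are chosen here to echo ROTL64, ROTL32, and MASK, and registers are modeled as 64-bit integers with bit 0 as the most significant bit.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    """Rotate a 64-bit value left by n positions (bits leaving bit 0 enter at bit 63)."""
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    """Rotate two copies of bits 32:63 of x, one in each half, left by n positions."""
    low = x & 0xFFFFFFFF
    return rotl64((low << 32) | low, n)

def mask(mstart, mstop):
    """1-bits from mstart through mstop inclusive, wrapping around if mstart > mstop."""
    ones_from_start = MASK64 >> mstart                  # bits mstart..63
    ones_to_stop = (MASK64 << (63 - mstop)) & MASK64    # bits 0..mstop
    if mstart <= mstop:
        return ones_from_start & ones_to_stop
    return ones_from_start | ones_to_stop
```

In this model mask(5, 4) wraps around and produces all ones, which matches the statement that an all-zero mask cannot be specified.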

The use of the mask is described in following sections. 

The Rotate and Shift instructions with Rc=1 set the first three bits of
CR Field 0 as described in Section 3.3.8, "Other Fixed-Point Instructions,"
on page 80. Rotate and Shift instructions do not change the OV
and SO bits. Rotate and Shift instructions, except algebraic right shifts,
do not change the CA bit.

Extended Mnemonics for Rotates and Shifts 

The Rotate and Shift instructions, while powerful, can be complicated to 
code (they have up to five operands). A set of extended mnemonics is 
provided that allow simpler coding of often-used functions such as clear- 
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ing the leftmost or rightmost bits of a register, left justifying or right justi- 
fying an arbitrary field, and performing simple rotates and shifts. Some of 
these extended mnemonics are shown as examples with the Rotate
instructions. See Appendix C, "Assembler Extended Mnemonics," on 
page 215 for additional extended mnemonics. 



Fixed-Point Rotate instructions 



These instructions rotate the contents of a register. The result of the rota- 
tion is 

■ inserted into the target register under control of a mask (if a mask
bit is 1 the associated bit of the rotated data is placed into the target
register, and if the mask bit is 0 the associated bit in the target
register remains unchanged); or

■ ANDed with a mask before being placed into the target register. 

The Rotate Left instructions allow right-rotation of the contents of a
register to be performed (in concept) by a left-rotation of 64-n, where n
is the number of bits by which to rotate right. They allow right-rotation
of the contents of the low-order 32 bits of a register to be performed (in
concept) by a left-rotation of 32-n, where n is the number of bits by
which to rotate right.

Rotate Left Doubleword Immediate then Clear Left MD-form

rldicl   RA,RS,SH,MB   (Rc=0)
rldicl.  RA,RS,SH,MB   (Rc=1)

| 30 | RS | RA | sh | mb | 0 | sh | Rc |
  0    6    11   16   21   27  30   31

n <- sh5 || sh0:4
r <- ROTL64((RS), n)
b <- mb5 || mb0:4
m <- MASK(b, 63)
RA <- r & m

The contents of register RS are rotated64 left SH bits. A mask is
generated having 1-bits from bit MB through bit 63 and 0-bits elsewhere. The
rotated data are ANDed with the generated mask and the result is placed
into register RA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 



Programming Note

rldicl can be used to extract an n-bit field that starts at bit position b
in register RS, right-justified into register RA (clearing the remaining
64-n bits of RA), by setting SH=b+n and MB=64-n. It can be used to
rotate the contents of a register left (right) by n bits by setting SH=n
(64-n) and MB=0. It can be used to shift the contents of a register
right by n bits by setting SH=64-n and MB=n. It can be used to clear the
high-order n bits of a register by setting SH=0 and MB=n.

Extended mnemonics are provided for all of these uses: see Appendix C,
"Assembler Extended Mnemonics," on page 215.
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handler to be invoked. 



Special Registers Altered

CR0 (if Rc=1)



Programming Note

rldicr can be used to extract an n-bit field that starts at bit position b
in register RS, left-justified into register RA (clearing the remaining
64-n bits of RA), by setting SH=b and ME=n-1. It can be used to rotate
the contents of a register left (right) by n bits by setting SH=n (64-n)
and ME=63. It can be used to shift the contents of a register left by n
bits by setting SH=n and ME=63-n. It can be used to clear the low-order
n bits of a register by setting SH=0 and ME=63-n.

Extended mnemonics are provided for all of these uses (some devolve to
rldicl): see Appendix C, "Assembler Extended Mnemonics," on page 215.



Extended Mnemonics: 

Examples of extended mnemonics for Rotate Left Doubleword Immedi- 
ate then Clear Left: 



Extended:            Equivalent to:

extrdi Rx,Ry,n,b     rldicl Rx,Ry,b+n,64-n
srdi Rx,Ry,n         rldicl Rx,Ry,64-n,n
clrldi Rx,Ry,n       rldicl Rx,Ry,0,n
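The way these extended mnemonics reduce to rldicl can be checked with a small Python model. It is a sketch under assumptions: registers are 64-bit integers with bit 0 as the most significant bit, the helper names are hypothetical, and the SH operand is reduced modulo 64 here so that the edge cases b+n=64 and n=0 fit the 6-bit field (how a particular assembler treats those cases is not stated in the text).

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64

def mask(mstart, mstop):
    start = MASK64 >> mstart
    stop = (MASK64 << (63 - mstop)) & MASK64
    return start & stop if mstart <= mstop else start | stop

def rldicl(rs, sh, mb):
    """Rotate left by sh, then clear the bits to the left of bit mb."""
    return rotl64(rs, sh) & mask(mb, 63)

def extrdi(rs, n, b):
    """Extract the n-bit field starting at bit b, right-justified (n > 0)."""
    return rldicl(rs, (b + n) % 64, 64 - n)

def srdi(rs, n):
    """Logical shift right by n bits (n < 64)."""
    return rldicl(rs, (64 - n) % 64, n)

def clrldi(rs, n):
    """Clear the high-order n bits (n < 64)."""
    return rldicl(rs, 0, n)
```

For instance, extrdi(0x00AB000000000000, 8, 8) picks out bits 8:15 and returns 0xAB.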



Rotate Left Doubleword Immediate then 
Clear Right MD-form 



rldicr   RA,RS,SH,ME   (Rc=0)
rldicr.  RA,RS,SH,ME   (Rc=1)

| 30 | RS | RA | sh | me | 1 | sh | Rc |
  0    6    11   16   21   27  30   31

n <- sh5 || sh0:4
r <- ROTL64((RS), n)
e <- me5 || me0:4
m <- MASK(0, e)
RA <- r & m

The contents of register RS are rotated64 left SH bits. A mask is
generated having 1-bits from bit 0 through bit ME and 0-bits elsewhere. The
rotated data are ANDed with the generated mask and the result is placed
into register RA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Special Registers Altered

CR0 (if Rc=1)
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Extended Mnemonics: 

Examples of extended mnemonics for Rotate Left Doubleword Immedi- 
ate then Clear Right: 



Extended:            Equivalent to:

extldi Rx,Ry,n,b     rldicr Rx,Ry,b,n-1
sldi Rx,Ry,n         rldicr Rx,Ry,n,63-n
clrrdi Rx,Ry,n       rldicr Rx,Ry,0,63-n



Rotate Left Doubleword Immediate then Clear MD-form 



rldic   RA,RS,SH,MB   (Rc=0)
rldic.  RA,RS,SH,MB   (Rc=1)

| 30 | RS | RA | sh | mb | 2 | sh | Rc |
  0    6    11   16   21   27  30   31

n <- sh5 || sh0:4
r <- ROTL64((RS), n)
b <- mb5 || mb0:4
m <- MASK(b, ¬n)
RA <- r & m

The contents of register RS are rotated64 left SH bits. A mask is
generated having 1-bits from bit MB through bit 63-SH and 0-bits
elsewhere. The rotated data are ANDed with the generated mask and the
result is placed into register RA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Programming Note

rldic can be used to clear the high-order b bits of the contents of a
register and then shift the result left by n bits by setting SH=n and
MB=b-n. It can be used to clear the high-order n bits of a register by
setting SH=0 and MB=n.

Extended mnemonics are provided for both of these uses (the second
devolves to rldicl): see Appendix C, "Assembler Extended Mnemonics,"
on page 215.



Special Registers Altered

CR0 (if Rc=1)



Extended Mnemonics: 

Example of extended mnemonics for Rotate Left Doubleword Immediate
then Clear:

Extended:             Equivalent to:

clrlsldi Rx,Ry,b,n    rldic Rx,Ry,n,b-n
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Rotate Left Word Immediate then AND with Mask iVI-form 



rlwinm   RA,RS,SH,MB,ME   (Rc=0)
rlwinm.  RA,RS,SH,MB,ME   (Rc=1)

[Power mnemonics: rlinm, rlinm.]

| 21 | RS | RA | SH | MB | ME | Rc |
  0    6    11   16   21   26   31

n <- SH
r <- ROTL32((RS)32:63, n)
m <- MASK(MB+32, ME+32)
RA <- r & m



The contents of register RS are rotated32 left SH bits. A mask is gener- 
ated having 1-bits from bit MB+32 through bit ME+32 and 0-bits else- 
where. The rotated data are ANDed with the generated mask and the 
result is placed into register RA. 



Special Registers Altered

CR0 (if Rc=1)



Extended Mnemonics: 

Examples of extended mnemonics for Rotate Left Word Immediate then 
AND with Mask: 



Extended:            Equivalent to:

extlwi Rx,Ry,n,b     rlwinm Rx,Ry,b,0,n-1
srwi Rx,Ry,n         rlwinm Rx,Ry,32-n,n,31
clrrwi Rx,Ry,n       rlwinm Rx,Ry,0,0,31-n
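A Python model of rlwinm makes it easier to see why these three equivalences hold. This is a sketch under assumptions: registers are 64-bit integers with bit 0 as the most significant bit, the helper names are hypothetical, and the SH value for srwi is reduced modulo 32 so that n=0 fits the 5-bit field.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    low = x & 0xFFFFFFFF
    return rotl64((low << 32) | low, n)

def mask(mstart, mstop):
    start = MASK64 >> mstart
    stop = (MASK64 << (63 - mstop)) & MASK64
    return start & stop if mstart <= mstop else start | stop

def rlwinm(rs, sh, mb, me):
    """Rotate the low-order word left by sh, then AND with MASK(mb+32, me+32)."""
    return rotl32(rs, sh) & mask(mb + 32, me + 32)

def extlwi(rs, n, b):
    """Extract the n-bit field at word bit b, left-justified in the low word (n > 0)."""
    return rlwinm(rs, b, 0, n - 1)

def srwi(rs, n):
    """Logical shift right of the low-order word by n bits (n < 32)."""
    return rlwinm(rs, (32 - n) % 32, n, 31)

def clrrwi(rs, n):
    """Clear the low-order n bits of the low-order word (n < 32)."""
    return rlwinm(rs, 0, 0, 31 - n)
```

Note that the high-order 32 bits of the source do not reach the result in any of these uses, since the mask lies entirely in the low-order word.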



Rotate Left Doubleword then Clear Left MDS-form 



rldcl   RA,RS,RB,MB   (Rc=0)
rldcl.  RA,RS,RB,MB   (Rc=1)

| 30 | RS | RA | RB | mb | 8 | Rc |
  0    6    11   16   21   27  31

n <- (RB)58:63
r <- ROTL64((RS), n)
b <- mb5 || mb0:4
m <- MASK(b, 63)
RA <- r & m



Programming Note

Let RSL represent the low-order 32 bits of register RS, with the bits
numbered from 0 through 31.

rlwinm can be used to extract an n-bit field that starts at bit position b
in RSL, right-justified into the low-order 32 bits of register RA
(clearing the remaining 32-n bits of the low-order 32 bits of RA), by
setting SH=b+n, MB=32-n, and ME=31. It can be used to extract an n-bit
field that starts at bit position b in RSL, left-justified into the
low-order 32 bits of register RA (clearing the remaining 32-n bits of the
low-order 32 bits of RA), by setting SH=b, MB=0, and ME=n-1. It can be
used to rotate the contents of the low-order 32 bits of a register left
(right) by n bits by setting SH=n (32-n), MB=0, and ME=31. It can be
used to shift the contents of the low-order 32 bits of a register right by
n bits by setting SH=32-n, MB=n, and ME=31. It can be used to clear the
high-order b bits of the low-order 32 bits of the contents of a register
and then shift the result left by n bits by setting SH=n, MB=b-n, and
ME=31-n. It can be used to clear the low-order n bits of the low-order
32 bits of a register by setting SH=0, MB=0, and ME=31-n.

For all the uses given 
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above, the high-order 32 
bits of register RA are 
cleared. 

Extended mnemonics are provided for all of these uses: see Appendix C,
"Assembler Extended Mnemonics," on page 215.

Programming Note

rldcl can be used to extract an n-bit field that starts at variable bit
position b in register RS, right-justified into register RA (clearing the
remaining 64-n bits of RA), by setting RB58:63=b+n and MB=64-n. It can
be used to rotate the contents of a register left (right) by variable n
bits by setting RB58:63=n (64-n) and MB=0.

Extended mnemonics are provided for some of these uses: see Appendix C,
"Assembler Extended Mnemonics," on page 215.

Programming Note

rldcr can be used to extract an n-bit field that starts at variable bit
position b in register RS, left-justified into register RA (clearing the
remaining 64-n bits of RA), by setting RB58:63=b and ME=n-1. It can be
used to rotate the contents of a register left (right) by variable n bits
by setting RB58:63=n (64-n) and ME=63.

Extended mnemonics are 



The contents of register RS are rotated64 left the number of bits
specified by (RB)58:63. A mask is generated having 1-bits from bit MB through
bit 63 and 0-bits elsewhere. The rotated data are ANDed with the
generated mask and the result is placed into register RA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Special Registers Altered

CR0 (if Rc=1)



Extended Mnemonics: 

Example of extended mnemonics for Rotate Left Doubleword then Clear 
Left: 

Extended: Equivalent to: 

rotld Rx,Ry,Rz rldcl Rx,Ry,Rz,0 



Rotate Left Doubleword then Clear Right MDS-form 



rldcr   RA,RS,RB,ME   (Rc=0)
rldcr.  RA,RS,RB,ME   (Rc=1)

| 30 | RS | RA | RB | me | 9 | Rc |
  0    6    11   16   21   27  31

n <- (RB)58:63
r <- ROTL64((RS), n)
e <- me5 || me0:4
m <- MASK(0, e)
RA <- r & m



The contents of register RS are rotated64 left the number of bits
specified by (RB)58:63. A mask is generated having 1-bits from bit 0 through
bit ME and 0-bits elsewhere. The rotated data are ANDed with the
generated mask and the result is placed into register RA.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Special Registers Altered

CR0 (if Rc=1)
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provided for some of 
these uses (some devolve 
to rldcl): see Appendix C, 
"Assembler Extended 
Mnemonics," on 
page 215. 

Programming Note

Let RSL represent the low-order 32 bits of register RS, with the bits
numbered from 0 through 31.

rlwnm can be used to extract an n-bit field that starts at variable bit
position b in RSL, right-justified into the low-order 32 bits of register
RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by
setting RB59:63=b+n, MB=32-n, and ME=31. It can be used to extract an
n-bit field that starts at variable bit position b in RSL, left-justified
into the low-order 32 bits of register RA (clearing the remaining 32-n
bits of the low-order 32 bits of RA), by setting RB59:63=b, MB=0, and
ME=n-1. It can be used to rotate the contents of the low-order 32 bits
of a register left (right) by variable n bits by setting RB59:63=n
(32-n), MB=0, and ME=31.

For all the uses given 
above, the high-order 32 
bits of register RA are 
cleared. 

Extended mnemonics are 
provided for some of 
these uses: see Appendix 
C, "Assembler Extended 



Rotate Left Word then AND with Mask M-form 

rlwnm   RA,RS,RB,MB,ME   (Rc=0)
rlwnm.  RA,RS,RB,MB,ME   (Rc=1)

[Power mnemonics: rlnm, rlnm.]

| 23 | RS | RA | RB | MB | ME | Rc |
  0    6    11   16   21   26   31

n <- (RB)59:63
r <- ROTL32((RS)32:63, n)
m <- MASK(MB+32, ME+32)
RA <- r & m

The contents of register RS are rotated32 left the number of bits
specified by (RB)59:63. A mask is generated having 1-bits from bit MB+32
through bit ME+32 and 0-bits elsewhere. The rotated data are ANDed
with the generated mask and the result is placed into register RA.



Special Registers Altered

CR0 (if Rc=1)



Extended Mnemonics: 

Example of extended mnemonics for Rotate Left Word then AND with 
Mask: 

Extended: Equivalent to: 

rotlw Rx,Ry,Rz rlwnm Rx,Ry,Rz,0,31 



Rotate Left Doubleword Immediate then 
Mask Insert MD-form 



rldimi   RA,RS,SH,MB   (Rc=0)
rldimi.  RA,RS,SH,MB   (Rc=1)

| 30 | RS | RA | sh | mb | 3 | sh | Rc |
  0    6    11   16   21   27  30   31

n <- sh5 || sh0:4
r <- ROTL64((RS), n)
b <- mb5 || mb0:4
m <- MASK(b, ¬n)
RA <- r&m | (RA)&¬m

The contents of register RS are rotated54 left SH bits. A mask is gen- 
erated having 1-bits from bit MB through bit 63-SH and 0-bits else- 
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Mnemonics/' on 
page 215. 

Programming Note

rldimi can be used to insert an n-bit field that is right-justified in
register RS into register RA starting at bit position b by setting
SH=64-(b+n) and MB=b.

An extended mnemonic is provided for this use: see Appendix C,
"Assembler Extended Mnemonics," on page 215.

Programming Note

Let RAL represent the low-order 32 bits of register RA, with the bits
numbered from 0 through 31.

rlwimi can be used to insert an n-bit field that is left-justified in the
low-order 32 bits of register RS into RAL starting at bit position b by
setting SH=32-b, MB=b, and ME=(b+n)-1. It can be used to insert an
n-bit field that is right-justified in the low-order 32 bits of register RS
into RAL starting at bit position b by setting SH=32-(b+n), MB=b, and
ME=(b+n)-1.

Extended mnemonics are provided for both of these uses: see Appendix C,
"Assembler Extended Mnemonics," on page 215.



where. The rotated data are inserted into register RA under control of 
the generated mask. 

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.



Special Registers Altered

CR0 (if Rc=1)



Extended Mnemonics: 

Example of extended mnemonics for Rotate Left Doubleword Immediate
then Mask Insert:

Extended:            Equivalent to:

insrdi Rx,Ry,n,b     rldimi Rx,Ry,64-(b+n),b



Rotate Left Word Immediate then Mask Insert M-form

rlwimi   RA,RS,SH,MB,ME   (Rc=0)
rlwimi.  RA,RS,SH,MB,ME   (Rc=1)

[Power mnemonics: rlimi, rlimi.]

| 20 | RS | RA | SH | MB | ME | Rc |
  0    6    11   16   21   26   31

n <- SH
r <- ROTL32((RS)32:63, n)
m <- MASK(MB+32, ME+32)
RA <- r&m | (RA)&¬m

The contents of register RS are rotated32 left SH bits. A mask is
generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits
elsewhere. The rotated data are inserted into register RA under control of
the generated mask.



Special Registers Altered

CR0 (if Rc=1)
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Extended Mnemonics: 

Example of extended mnemonics for Rotate Left Word Immediate then 

Mask Insert: 

Extended: Equivalent to: 

inslwi Rx,Ry,n,b rlwimi Rx,Ry,32-b,b,b+n-1
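The insert-under-mask behavior of rldimi and rlwimi, and the insrdi and inslwi equivalences, can be sketched in Python. The model below rests on assumptions: registers are 64-bit integers with bit 0 as the most significant bit, helper names are hypothetical, the old value of RA is passed in explicitly, and the SH value for inslwi is reduced modulo 32 so that b=0 fits the 5-bit field.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    low = x & 0xFFFFFFFF
    return rotl64((low << 32) | low, n)

def mask(mstart, mstop):
    start = MASK64 >> mstart
    stop = (MASK64 << (63 - mstop)) & MASK64
    return start & stop if mstart <= mstop else start | stop

def rldimi(ra, rs, sh, mb):
    """Insert rotated RS into RA where MASK(mb, 63-sh) has 1-bits."""
    m = mask(mb, 63 - sh)
    return (rotl64(rs, sh) & m) | (ra & ~m & MASK64)

def rlwimi(ra, rs, sh, mb, me):
    """Insert the rotated low word of RS into RA under MASK(mb+32, me+32)."""
    m = mask(mb + 32, me + 32)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK64)

def insrdi(ra, rs, n, b):
    """Insert a right-justified n-bit field of RS into RA at bit b."""
    return rldimi(ra, rs, 64 - (b + n), b)

def inslwi(ra, rs, n, b):
    """Insert a left-justified n-bit field of the low word of RS at word bit b."""
    return rlwimi(ra, rs, (32 - b) % 32, b, b + n - 1)
```

Bits of RA outside the mask are left unchanged, which is the difference between these instructions and the AND-with-mask rotates.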



Programming Note 

Multiple-precision shifts 
can be programmed as 
shown in Appendix E.2, 
"Multiple-Precision 
Shifts," on page 256. 



Fixed-Point Shift Instructions

The instructions in this section perform left and right shifts.

Extended Mnemonics for Shifts

Immediate-form logical (unsigned) shift operations are obtained by speci- 
fying appropriate masks and shift values for certain Rotate instructions. 
A set of extended mnemonics is provided to make coding of such shifts 
simpler and easier to understand. Some of these are shown as examples 
with the Rotate instructions. See Appendix C, "Assembler Extended 
Mnemonics," on page 215 for additional extended mnemonics. 



Shift Left Doubleword X-form 



sld   RA,RS,RB   (Rc=0)
sld.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 27 | Rc |
  0    6    11   16   21   31

n <- (RB)58:63
r <- ROTL64((RS), n)
if (RB)57 = 0 then
   m <- MASK(0, 63-n)
else m <- ⁶⁴0
RA <- r & m



The contents of register RS are shifted left the number of bits specified
by (RB)57:63. Bits shifted out of position 0 are lost. Zeros are supplied
to the vacated positions on the right. The result is placed into register RA.
Shift amounts from 64 to 127 give a zero result.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Special Registers Altered

CR0 (if Rc=1)
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Shift Left Word X-form 



slw   RA,RS,RB   (Rc=0)
slw.  RA,RS,RB   (Rc=1)

[Power mnemonics: sl, sl.]

| 31 | RS | RA | RB | 24 | Rc |
  0    6    11   16   21   31

n <- (RB)59:63
r <- ROTL32((RS)32:63, n)
if (RB)58 = 0 then
   m <- MASK(32, 63-n)
else m <- ⁶⁴0
RA <- r & m

The contents of the low-order 32 bits of register RS are shifted left the
number of bits specified by (RB)58:63. Bits shifted out of position 32 are
lost. Zeros are supplied to the vacated positions on the right. The 32-bit
result is placed into RA32:63. RA0:31 are set to zero. Shift amounts from
32 to 63 give a zero result.



Special Registers Altered

CR0 (if Rc=1)



Shift Right Doubleword X-form 



srd   RA,RS,RB   (Rc=0)
srd.  RA,RS,RB   (Rc=1)

| 31 | RS | RA | RB | 539 | Rc |
  0    6    11   16   21    31

n <- (RB)58:63
r <- ROTL64((RS), 64-n)
if (RB)57 = 0 then
   m <- MASK(n, 63)
else m <- ⁶⁴0
RA <- r & m

The contents of register RS are shifted right the number of bits
specified by (RB)57:63. Bits shifted out of position 63 are lost. Zeros are
supplied to the vacated positions on the left. The result is placed into
register RA. Shift amounts from 64 to 127 give a zero result.

This instruction is defined only for 64-bit implementations. Using it on 
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a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 

Special Registers Altered

CR0 (if Rc=1)

Shift Right Word X-form 

srw   RA,RS,RB   (Rc=0)
srw.  RA,RS,RB   (Rc=1)

[Power mnemonics: sr, sr.]

| 31 | RS | RA | RB | 536 | Rc |
  0    6    11   16   21    31

n <- (RB)59:63
r <- ROTL32((RS)32:63, 64-n)
if (RB)58 = 0 then
   m <- MASK(n+32, 63)
else m <- ⁶⁴0
RA <- r & m

The contents of the low-order 32 bits of register RS are shifted right
the number of bits specified by (RB)58:63. Bits shifted out of position 63
are lost. Zeros are supplied to the vacated positions on the left. The
32-bit result is placed into RA32:63. RA0:31 are set to zero. Shift amounts
from 32 to 63 give a zero result.

Special Registers Altered

CR0 (if Rc=1)
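The four logical shifts are defined as a rotate followed by a mask, with one extra bit of RB deciding whether the result is forced to zero. A Python sketch of that definition follows; it assumes registers are 64-bit integers with bit 0 as the most significant bit, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    low = x & 0xFFFFFFFF
    return rotl64((low << 32) | low, n)

def mask(mstart, mstop):
    start = MASK64 >> mstart
    stop = (MASK64 << (63 - mstop)) & MASK64
    return start & stop if mstart <= mstop else start | stop

def sld(rs, rb):
    n = rb & 0x3F                                            # (RB)58:63
    m = mask(0, 63 - n) if ((rb >> 6) & 1) == 0 else 0       # (RB)57 selects zero
    return rotl64(rs, n) & m

def slw(rs, rb):
    n = rb & 0x1F                                            # (RB)59:63
    m = mask(32, 63 - n) if ((rb >> 5) & 1) == 0 else 0      # (RB)58 selects zero
    return rotl32(rs, n) & m

def srd(rs, rb):
    n = rb & 0x3F
    m = mask(n, 63) if ((rb >> 6) & 1) == 0 else 0
    return rotl64(rs, 64 - n) & m

def srw(rs, rb):
    n = rb & 0x1F
    m = mask(n + 32, 63) if ((rb >> 5) & 1) == 0 else 0
    return rotl32(rs, 64 - n) & m
```

Only the low-order 7 bits of RB (6 bits for the word forms) take part, so in this model sld(x, 128) returns x unchanged while sld(x, 64) returns zero.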
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Shift Right Algebraic Doubleword immediate XS-form 



sradi   RA,RS,SH   (Rc=0)
sradi.  RA,RS,SH   (Rc=1)

| 31 | RS | RA | sh | 413 | sh | Rc |
  0    6    11   16   21    30   31

n <- sh5 || sh0:4
r <- ROTL64((RS), 64-n)
m <- MASK(n, 63)
s <- (RS)0
RA <- r&m | (⁶⁴s)&¬m
CA <- s & ((r&¬m)≠0)



The contents of register RS are shifted right SH bits. Bits shifted out of
position 63 are lost. Bit 0 of RS is replicated to fill the vacated positions
on the left. The result is placed into register RA. CA is set to 1 if (RS) is
negative and any 1-bits are shifted out of position 63; otherwise CA is set
to 0. A shift amount of zero causes RA to be set equal to (RS), and CA to
be set to 0.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Programming Note 

Any Shift Right Algebraic 
iii^Li ui.lluri, fuiiowed by 
addze, can be used to 
divide quickly by 2^. The 
setting of the CA bit by 
the Shift Right Algebraic 
instructions is 
independent of mode. 



Special Registers Altered 

CA 
CRO 



(if Rc=l) 
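The Programming Note for the Shift Right Algebraic instructions can be illustrated with a small Python model. This is a sketch under stated assumptions: the helper names (to_signed64, sradi, addze, divide_pow2) are invented for the example, registers are modelled as Python integers, and addze is reduced to "add the CA bit to the register". The point is that the algebraic shift alone rounds toward minus infinity, and adding CA corrects negative inexact quotients so that the result rounds toward zero.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x: int) -> int:
    """Interpret a 64-bit register image as a two's complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def sradi(rs: int, sh: int):
    """Return (RA, CA) for a 64-bit algebraic right shift by sh (0..63)."""
    value = to_signed64(rs)
    ra = (value >> sh) & MASK64            # Python's >> on ints is arithmetic
    shifted_out = rs & ((1 << sh) - 1)     # bits lost off position 63
    ca = 1 if value < 0 and shifted_out != 0 else 0
    return ra, ca


def addze(ra: int, ca: int) -> int:
    """Simplified addze: RT <- (RA) + CA."""
    return (ra + ca) & MASK64


def divide_pow2(x: int, n: int) -> int:
    """Signed division of x by 2**n, truncating toward zero."""
    ra, ca = sradi(x & MASK64, n)
    return to_signed64(addze(ra, ca))


print(divide_pow2(-7, 1), divide_pow2(7, 1))
```

For -7 shifted right by one, the shift yields -4 with CA set to 1, and the addze step produces -3.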



Shift Right Algebraic Word immediate X-form

srawi   RA,RS,SH        (Rc=0)
srawi.  RA,RS,SH        (Rc=1)

[Power mnemonics: srai, srai.]

31      RS      RA      SH      824     Rc
0       6       11      16      21      31

n <- SH
r <- ROTL32((RS)32:63, 64-n)
m <- MASK(n+32, 63)
s <- (RS)32
RA <- r&m | (⁶⁴s)&¬m
CA <- s & ((r&¬m)32:63≠0)

The contents of the low-order 32 bits of register RS are shifted right
SH bits. Bits shifted out of position 63 are lost. Bit 32 of RS is replicated
to fill the vacated positions on the left. The 32-bit result is placed into
RA32:63. Bit 32 of RS is replicated to fill RA0:31. CA is set to 1 if the
low-order 32 bits of (RS) contain a negative number and any 1-bits are
shifted out of position 63; otherwise CA is set to 0. A shift amount of zero
causes RA to receive EXTS((RS)32:63), and CA to be set to 0.

Special Registers Altered

CA
CR0     (if Rc=1)



Shift Right Algebraic Doubleword X-form

srad    RA,RS,RB        (Rc=0)
srad.   RA,RS,RB        (Rc=1)

31      RS      RA      RB      794     Rc
0       6       11      16      21      31

n <- (RB)58:63
r <- ROTL64((RS), 64-n)
if (RB)57 = 0 then
    m <- MASK(n, 63)
else m <- ⁶⁴0
s <- (RS)0
RA <- r&m | (⁶⁴s)&¬m
CA <- s & ((r&¬m)≠0)

The contents of register RS are shifted right the number of bits specified
by (RB)57:63. Bits shifted out of position 63 are lost. Bit 0 of RS is
replicated to fill the vacated positions on the left. The result is placed into
register RA. CA is set to 1 if (RS) is negative and any 1-bits are shifted
out of position 63; otherwise CA is set to 0. A shift amount of zero
causes RA to be set equal to (RS), and CA to be set to 0. Shift amounts
from 64 to 127 give a result of 64 sign bits in RA, and cause CA to
receive the sign bit of (RS).

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered

CA
CR0     (if Rc=1)






Shift Right Algebraic Word X-form

sraw    RA,RS,RB        (Rc=0)
sraw.   RA,RS,RB        (Rc=1)

[Power mnemonics: sra, sra.]

31      RS      RA      RB      792     Rc
0       6       11      16      21      31

n <- (RB)59:63
r <- ROTL32((RS)32:63, 64-n)
if (RB)58 = 0 then
    m <- MASK(n+32, 63)
else m <- ⁶⁴0
s <- (RS)32
RA <- r&m | (⁶⁴s)&¬m
CA <- s & ((r&¬m)32:63≠0)

The contents of the low-order 32 bits of register RS are shifted right
the number of bits specified by (RB)58:63. Bits shifted out of position 63
are lost. Bit 32 of RS is replicated to fill the vacated positions on the left.
The 32-bit result is placed into RA32:63. Bit 32 of RS is replicated to fill
RA0:31. CA is set to 1 if the low-order 32 bits of (RS) contain a negative
number and any 1-bits are shifted out of position 63; otherwise CA is set
to 0. A shift amount of zero causes RA to receive EXTS((RS)32:63), and
CA to be set to 0. Shift amounts from 32 to 63 give a result of 64 sign
bits, and cause CA to receive the sign bit of (RS)32:63.

Special Registers Altered

CA
CR0     (if Rc=1)
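The handling of shift amounts 32 to 63 is the part of sraw that is easiest to get wrong, so a small Python model may help. It is only a sketch: the name sraw and the tuple return value (RA, CA) are assumptions made for the example, registers are Python integers, and CR0 recording is omitted.

```python
MASK64 = (1 << 64) - 1


def sraw(rs: int, rb: int):
    """Return (RA, CA) for sraw, following the description above."""
    n = rb & 0x3F                        # (RB)58:63
    word = rs & 0xFFFFFFFF               # (RS)32:63
    s = word >> 31                       # sign bit of the word, (RS)32
    signed = word - (1 << 32) if s else word
    if n >= 32:                          # shift amounts 32..63
        return (MASK64 if s else 0), s   # 64 sign bits; CA gets the sign bit
    ra = (signed >> n) & MASK64          # result sign-extended to 64 bits
    lost = word & ((1 << n) - 1)         # bits shifted out of position 63
    ca = 1 if s and lost else 0
    return ra, ca


print([hex(v) for v in sraw(0x80000001, 1)])
```

The high-order 32 bits of RS never influence the result, which is why the model masks them off before doing anything else.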



3.3.14 Move to/from System Register 
Instructions 

Extended Mnemonics:

A set of extended mnemonics is provided for the mtspr and mfspr
instructions so that they can be coded with the SPR name as part of the
mnemonic rather than as a numeric operand. Some of these are shown as
examples with the two instructions. See Appendix C, "Assembler
Extended Mnemonics," on page 215 for additional extended mnemonics.

Compiler and Assembler Note

For the mtspr and mfspr instructions, the SPR number coded in assembler
language does not appear directly as a 10-bit binary number in the
instruction. The number coded is split into two 5-bit halves that are
reversed in the instruction, with the high-order 5 bits appearing in bits
16:20 of the instruction and the low-order 5 bits in bits 11:15. This
maintains compatibility with POWER SPR encodings, in which these two
instructions have only a 5-bit SPR field occupying bits 11:15.
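The swap of the two 5-bit halves described in the Compiler and Assembler Note can be sketched as follows. The function names spr_field and spr_number are hypothetical helpers introduced for this example. The 10-bit value returned by spr_field is laid out so that its most significant bit corresponds to instruction bit 11.

```python
def spr_field(spr_number_coded: int) -> int:
    """Return the 10-bit spr field (instruction bits 11:20) for an SPR number."""
    if not 0 <= spr_number_coded < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    high = (spr_number_coded >> 5) & 0x1F   # placed in instruction bits 16:20
    low = spr_number_coded & 0x1F           # placed in instruction bits 11:15
    return (low << 5) | high


def spr_number(field: int) -> int:
    """Inverse mapping: recover the SPR number from the 10-bit field."""
    return ((field & 0x1F) << 5) | ((field >> 5) & 0x1F)


for name, number in (("XER", 1), ("LR", 8), ("CTR", 9)):
    print(name, format(spr_field(number), "010b"))
```

Because the swap is its own inverse, applying it twice returns the original number, which is a convenient sanity check for an assembler or disassembler.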

Compatibility Note

For a discussion of POWER compatibility with respect to SPR numbers not
shown in the instruction descriptions for mtspr and mfspr, please refer to
Appendix G, "Incompatibilities with the POWER Architecture," on
page 271. For compatibility with future versions of this architecture, only
SPR numbers discussed in these instruction descriptions should be used.



Move To Special Purpose Register XFX-form

mtspr   SPR,RS

31      RS      spr             467     /
0       6       11              21      31

n <- spr5:9 || spr0:4
if length(SPREG(n)) = 64 then
    SPREG(n) <- (RS)
else
    SPREG(n) <- (RS)32:63{0:31}

The SPR field denotes a Special Purpose Register, encoded as shown in
the table below. The contents of register RS are placed into the designated
Special Purpose Register. For Special Purpose Registers that are 32
bits long, the low-order 32 bits of RS are placed into the SPR.

decimal     SPR*                Register name
            spr5:9  spr0:4
1           00000   00001       XER
8           00000   01000       LR
9           00000   01001       CTR

* Note that the order of the two 5-bit halves of the SPR number is reversed.

If the SPR field contains any value other than one of the values shown
above then one of the following occurs.

■ The system illegal instruction error handler is invoked.

■ The system privileged instruction error handler is invoked.

■ The results are boundedly undefined.

A complete description of this instruction can be found in Book III,
Section 3.4.1, "Move to/from System Register Instructions," on
page 384.

Special Registers Altered

See above

Extended Mnemonics:

Examples of extended mnemonics for Move To Special Purpose Register:

Extended:       Equivalent to:
mtxer Rx        mtspr 1,Rx
mtlr Rx         mtspr 8,Rx
mtctr Rx        mtspr 9,Rx



Move From Special Purpose Register XFX-form

mfspr   RT,SPR

31      RT      spr             339     /
0       6       11              21      31

n <- spr5:9 || spr0:4
if length(SPREG(n)) = 64 then
    RT <- SPREG(n)
else
    RT <- ³²0 || SPREG(n)

The SPR field denotes a Special Purpose Register, encoded as shown in
the table below. The contents of the designated Special Purpose Register
are placed into register RT. For Special Purpose Registers that are 32 bits
long, the low-order 32 bits of RT receive the contents of the Special
Purpose Register and the high-order 32 bits of RT are set to zero.

decimal     SPR*                Register name
            spr5:9  spr0:4
1           00000   00001       XER
8           00000   01000       LR
9           00000   01001       CTR

* Note that the order of the two 5-bit halves of the SPR number is reversed.

If the SPR field contains any value other than one of the values shown
above then one of the following occurs.

■ The system illegal instruction error handler is invoked.

■ The system privileged instruction error handler is invoked.

■ The results are boundedly undefined.

A complete description of this instruction can be found in Book III,
Section 3.4.1, "Move to/from System Register Instructions," on
page 384.

Special Registers Altered

None



Compiler/Assembler/ 
Compatibility Notes 

See the Notes that 
appear with mtspr in 
"Move To Special Purpose 
Register XFX-form," on 
page 129. 



Extended Mnemonics:

Examples of extended mnemonics for Move From Special Purpose Register:

Extended:       Equivalent to:
mfxer Rx        mfspr Rx,1
mflr Rx         mfspr Rx,8
mfctr Rx        mfspr Rx,9
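To show how the extended mnemonics map onto actual instruction words, the following Python sketch assembles mtspr and mfspr from the field layouts given in the two instruction descriptions (primary opcode 31, extended opcodes 467 and 339, reserved bit 31 left as 0). The helper names _xfx, mtspr and mfspr are assumptions of the example and do not correspond to any real assembler API.

```python
def _xfx(reg: int, spr: int, xo: int) -> int:
    """Build an XFX-form word: opcode 31, RS/RT, swapped spr field, XO."""
    if not (0 <= reg < 32 and 0 <= spr < 1024):
        raise ValueError("register or SPR number out of range")
    field = ((spr & 0x1F) << 5) | ((spr >> 5) & 0x1F)    # halves reversed
    return (31 << 26) | (reg << 21) | (field << 11) | (xo << 1)


def mtspr(spr: int, rs: int) -> int:
    return _xfx(rs, spr, 467)


def mfspr(rt: int, spr: int) -> int:
    return _xfx(rt, spr, 339)


# mflr R0 is mfspr R0,8 and mtctr R3 is mtspr 9,R3.
print(hex(mfspr(0, 8)), hex(mtspr(9, 3)))
```

The shifts follow from the bit numbering used in this book, in which bit 0 is the most significant bit of the 32-bit instruction, so a field ending at bit b is shifted left by 31-b.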



Move To Condition Register Fields XFX-form

mtcrf   FXM,RS

31      RS      /       FXM     /       144     /
0       6       11      12      20      21      31

mask <- ⁴(FXM0) || ⁴(FXM1) || ... ⁴(FXM7)
CR <- ((RS)32:63 & mask) | (CR & ¬mask)

The contents of bits 32:63 of register RS are placed into the Condition
Register under control of the field mask specified by FXM. The field mask
identifies the 4-bit fields affected. Let i be an integer in the range 0-7. If
FXM(i) = 1 then CR field i (CR bits 4×i through 4×i+3) is set to the
contents of the corresponding field of the low-order 32 bits of RS.

Programming Note

Updating a proper subset of the eight fields of the Condition Register
may result in substantially poorer performance on some implementations
than updating all of the fields.

Special Registers Altered 

CR fields selected by mask 
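The expansion of the 8-bit FXM field into a 32-bit mask for mtcrf can be sketched in Python as shown here. The function name mtcrf and its argument order are choices made for this example only. CR and RS are passed as integers and the new CR value is returned.

```python
def mtcrf(cr: int, fxm: int, rs: int) -> int:
    """Return the new 32-bit CR after mtcrf FXM,RS."""
    mask = 0
    for i in range(8):
        if (fxm >> (7 - i)) & 1:             # FXM(i); bit 0 is the leftmost
            mask |= 0xF << (4 * (7 - i))     # CR field i = CR bits 4i..4i+3
    low_word = rs & 0xFFFFFFFF               # (RS)32:63
    return (low_word & mask) | (cr & ~mask & 0xFFFFFFFF)


# Update only CR field 0 from the top nibble of the low word of RS.
print(hex(mtcrf(0x00000000, 0x80, 0x12345678)))
```

An FXM of 0xFF replaces the whole Condition Register, while an FXM of 0 leaves it unchanged.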
Move to Condition Register from XER X-form

mcrxr   BF

31      BF      //      ///     ///     512     /
0       6       9       11      16      21      31

CR4×BF:4×BF+3 <- XER0:3
XER0:3 <- 0b0000

The contents of XER0:3 are copied into the Condition Register field
designated by BF. XER0:3 are set to zero.

Special Registers Altered

CR      XER0:3
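A short Python model of mcrxr makes the two-step effect (copy, then clear) explicit. This is a sketch only: the function name and the convention of returning the pair (new CR, new XER) are assumptions, and the XER is treated here as a 32-bit value whose bits 0:3 are its four most significant bits.

```python
def mcrxr(cr: int, xer: int, bf: int):
    """Return (new CR, new XER) after mcrxr BF."""
    if not 0 <= bf <= 7:
        raise ValueError("BF must name one of the eight CR fields")
    top4 = (xer >> 28) & 0xF                 # XER0:3
    shift = 4 * (7 - bf)                     # position of CR field BF
    new_cr = (cr & ~(0xF << shift) & 0xFFFFFFFF) | (top4 << shift)
    new_xer = xer & 0x0FFFFFFF               # XER0:3 <- 0b0000
    return new_cr, new_xer


print([hex(v) for v in mcrxr(0x00000000, 0xA0000005, 0)])
```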

Move From Condition Register X-form

mfcr    RT

31      RT      ///     ///     19      /
0       6       11      16      21      31

RT <- ³²0 || CR

The contents of the Condition Register are placed into RT32:63.
RT0:31 are set to 0.

Special Registers Altered

None

4.1 Floating-Point Processor Overview

This chapter describes the registers and instructions that make up the 
Floating-Point Processor facility. Section 4.2, "Floating-Point Processor 
Registers," on page 135 describes the registers associated with the Float- 
ing-Point Processor. Section 4.6, "Floating-Point Processor Instructions," 
on page 167 describes the instructions associated with the Floating-Point 
Processor. 

This architecture specifies that the processor implement a floating-point
system as defined in ANSI/IEEE Standard 754-1985, "IEEE Standard
for Binary Floating-Point Arithmetic" (hereafter referred to as "the
IEEE standard"), but requires software support in order to conform fully
with that standard. That standard defines certain required "operations"
(addition, subtraction, etc.); the term "floating-point operation" is used
in this chapter to refer to one of these required operations, or to the
operation performed by one of the Multiply-Add or Reciprocal Estimate
instructions. All floating-point operations conform to that standard,
except if software sets the Floating-Point Non-IEEE Mode (NI) bit in the
Floating-Point Status and Control Register to 1 (see page 140), in which
case floating-point operations do not necessarily conform to that
standard.

Instructions are provided to perform arithmetic, rounding, conversion, 
comparison, and other operations in floating-point registers; to move 
floating-point data between storage and these registers; and to manipu- 
late the Floating-Point Status and Control Register explicitly. 

These instructions are divided into two categories.

■ computational instructions

The computational instructions are those that perform addition,
subtraction, multiplication, division, extracting the square root, rounding,
conversion, comparison, and combinations of these operations. These
instructions provide the floating-point operations. They place status
information into the Floating-Point Status and Control Register. They
are the instructions described in Sections 4.6.5 through 4.6.7 and
Appendix A.1.2.

■ non-computational instructions 

The non-computational instructions are those that perform loads and 
stores, move the contents of a floating-point register to another float- 
ing-point register possibly altering the sign, manipulate the Floating- 
Point Status and Control Register explicitly, and select the value from 
one of two floating-point registers based on the value in a third float- 
ing-point register. The operations performed by these instructions are 
not considered floating-point operations. With the exception of the 
instructions that manipulate the Floating-Point Status and Control 
Register explicitly, they do not alter the Floating-Point Status and 
Control Register. They are the instructions described in Sections 4.6.2 
through 4.6.4, 4.6.8, and Appendix A.1.3. 

A floating-point number consists of a signed exponent and a signed
significand. The quantity expressed by this number is the product of the
significand and the number 2^exponent. Encodings are provided in the data
format to represent finite numeric values, ± Infinity, and values that are
"Not a Number" (NaN). Operations involving infinities produce results
obeying traditional mathematical conventions. NaNs have no
mathematical interpretation. Their encoding permits a variable diagnostic
information field. They may be used to indicate such things as uninitialized
variables and can be produced by certain invalid operations.

There is one class of exceptional events that occur during instruction 
execution that is unique to the Floating-Point Processor: the Floating- 
Point Exception. Floating-point exceptions are signaled with bits set in 
the Floating-Point Status and Control Register (FPSCR). They can cause 
the system floating-point enabled exception error handler to be invoked, 
precisely or imprecisely, if the proper control bits are set.

Floating-Point Exceptions

The following floating-point exceptions are detected by the processor:

■ Invalid Operation Exception    (VX)
      SNaN                       (VXSNAN)
      Infinity - Infinity        (VXISI)
      Infinity ÷ Infinity        (VXIDI)
      Zero ÷ Zero                (VXZDZ)
      Infinity × Zero            (VXIMZ)
      Invalid Compare            (VXVC)
      Software Request           (VXSOFT)
      Invalid Square Root        (VXSQRT)
      Invalid Integer Convert    (VXCVI)
■ Zero Divide Exception          (ZX)
■ Overflow Exception             (OX)
■ Underflow Exception            (UX)
■ Inexact Exception              (XX)


Each floating-point exception, and each category of Invalid Operation 
Exception, has an exception bit in the FPSCR. In addition, each floating- 
point exception has a corresponding enable bit in the FPSCR. See Section 
4.2.2, "Floating-Point Status and Control Register," on page 137, for a 
description of these exception and enable bits, and Section 4.4, "Floating- 
Point Exceptions," on page 150, for a detailed discussion of floating- 
point exceptions, including the effects of the enable bits. 

4.2 Floating-Point Processor Registers 
4.2.1 Floating-Point Registers 

Implementations of this architecture provide 32 floating-point registers 
(FPRs). The floating-point instruction formats provide 5-bit fields for 
specifying the FPRs to be used in the execution of the instruction. The 
FPRs are numbered 0-31. See Figure 23 on page 136.

Each FPR contains 64 bits that support the floating-point double format.
Every instruction that interprets the contents of an FPR as a
floating-point value uses the floating-point double format for this
interpretation.

The computational instructions, and the Move and Select instructions, 
operate on data located in FPRs and, with the exception of the Compare 
instructions, place the result value into an FPR and optionally place sta- 
tus information into the Condition Register. 

Load and store double instructions are provided that transfer 64 bits 
of data between storage and the FPRs with no conversion. Load single 
instructions are provided to transfer and convert floating-point values in 
floating-point single format from storage to the same value in floating- 
point double format in the FPRs. Store single instructions are provided to 
transfer and convert floating-point values in floating-point double format 
from the FPRs to the same value in floating-point single format in stor- 
age. 
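The conversion performed by the load single and store single instructions can be imitated on a host machine with Python's struct module. The sketch below is only an approximation of the architected behaviour: the names load_single and store_single are invented for the example, it relies on the host's own single/double conversion, and it is meant for ordinary numeric values that are representable in single format (NaN payloads and out-of-range values are outside its scope).

```python
import struct


def load_single(word: int) -> int:
    """32-bit single-format image from storage -> 64-bit double-format image."""
    (value,) = struct.unpack(">f", word.to_bytes(4, "big"))
    return int.from_bytes(struct.pack(">d", value), "big")


def store_single(dword: int) -> int:
    """64-bit double-format image in an FPR -> 32-bit single-format image."""
    (value,) = struct.unpack(">d", dword.to_bytes(8, "big"))
    return int.from_bytes(struct.pack(">f", value), "big")


# 1.5 in single format, as it would appear in an FPR after a load single.
print(hex(load_single(0x3FC00000)))
```

Loading a single and storing it back yields the original 32-bit image, since every single-format value is exactly representable in double format.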

Instructions are provided that manipulate the Floating-Point Status 
and Control Register and the Condition Register explicitly. Some of these 
instructions copy data from an FPR to the Floating-Point Status and Con- 
trol Register or vice versa. 

The computational instructions and the Select instruction accept values
from the FPRs in double format. For single-precision arithmetic
instructions, all input values must be representable in single format. If
they are not, the result placed into the target FPR, and the setting of
status bits in the FPSCR and in the Condition Register (if Rc=1) are
undefined.

The arithmetic, rounding, and conversion instructions produce inter- 
mediate results that may be regarded as being infinitely precise. After nor- 
malization or denormalization, if the infinitely precise intermediate result 
is not representable in the destination format (either 32-bit or 64-bit) 
then it is rounded. The final result is then placed into the FPR in the dou- 
ble format. 



FPR 00
FPR 01
. . .
FPR 30
FPR 31
0               63

4.2.2 Floating-Point Status and Control Register

The Floating-Point Status and Control Register (FPSCR) controls the 
handling of floating-point exceptions and records status resulting from 
the floating-point operations. Bits 0:23 are status bits. Bits 24:31 are con- 
trol bits. 

The exception bits in the FPSCR (bits 0:12, 21:23) are sticky, with the 
exception of Floating-Point Enabled Exception Summary (FEX) and 
Floating-Point Invalid Operation Exception Summary (VX). That is, once 
set, the sticky bits remain set until they are cleared by an mcrfs, mtfsfi, 
mtfsf, or mtfsb0 instruction.

FEX and VX are simply the ORs of other FPSCR bits. Therefore these 
two bits are not listed among the FPSCR bits affected by the various 
instructions. 



FPSCR

0 31 



Figure 24. Floating-Point Status and Control Register 

The format of the FPSCR is: 
Bit(s) Description 



0 Floating-Point Exception Summary (FX)

Every floating-point instruction, except mtfsfi and mtfsf, implicitly
sets FPSCRFX to 1 if that instruction causes any of the floating-point
exception bits in the FPSCR to change from 0 to 1. mcrfs, mtfsfi,
mtfsf, mtfsb0, and mtfsb1 can alter FPSCRFX explicitly.

1 Floating-Point Enabled Exception Summary (FEX)

This bit indicates whether any enabled exceptions have occurred.
It is the OR of all the floating-point exception bits masked by their
respective enable bits. mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1
cannot alter FPSCRFEX explicitly.

2 Floating-Point Invalid Operation Exception Summary (VX)

This bit indicates whether any invalid operation exceptions have
occurred. It is the OR of all the Invalid Operation exception bits.
mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1 cannot alter FPSCRVX
explicitly.

3 Floating-Point Overflow Exception (OX)

See Section 4.4.3, "Overflow Exception," on page 159. 

4 Floating-Point Underflow Exception (UX) 

See Section 4.4.4, "Underflow Exception," on page 160. 

5 Floating-Point Zero Divide Exception (ZX) 

See Section 4.4.2, "Zero Divide Exception," on page 158. 

6 Floating-Point Inexact Exception (XX) 

See Section 4.4.5, "Inexact Exception," on page 162. 

FPSCRXX is a sticky version of FPSCRFI (see below). Thus the
following rules completely describe how FPSCRXX is set by a
given instruction.

■ If the instruction affects FPSCRFI, the new value of FPSCRXX
is obtained by ORing the old value of FPSCRXX with the new
value of FPSCRFI.

■ If the instruction does not affect FPSCRFI, the value of
FPSCRXX is unchanged.

7 Floating-Point Invalid Operation Exception (SNaN) (VXSNAN) 
See Section 4.4.1, "Invalid Operation Exception," on page 155. 

8 Floating-Point Invalid Operation Exception (∞ - ∞) (VXISI)
See Section 4.4.1, "Invalid Operation Exception," on page 155.

9 Floating-Point Invalid Operation Exception (∞ ÷ ∞) (VXIDI)
See Section 4.4.1, "Invalid Operation Exception," on page 155.

10 Floating-Point Invalid Operation Exception (0 ÷ 0) (VXZDZ)
See Section 4.4.1, "Invalid Operation Exception," on page 155.

11 Floating-Point Invalid Operation Exception (∞ × 0) (VXIMZ)
See Section 4.4.1, "Invalid Operation Exception," on page 155.

12 Floating-Point Invalid Operation Exception (Invalid Compare)
(VXVC)

See Section 4.4.1, "Invalid Operation Exception," on page 155. 

13 Floating-Point Fraction Rounded (FR) 

The last Arithmetic or Rounding and Conversion instruction that 
rounded the intermediate result incremented the fraction. See Sec- 
tion 4.3.6, "Rounding," on page 149. This bit is not sticky. 

14 Floating-Point Fraction Inexact (FI)

The last Arithmetic or Rounding and Conversion instruction either
rounded the intermediate result (producing an inexact fraction) or
caused a disabled Overflow Exception. See Section 4.3.6,
"Rounding," on page 149. This bit is not sticky.

See the definition of FPSCRXX, above, regarding the relationship
between FPSCRFI and FPSCRXX.

15:19 Floating-Point Result Flags (FPRF) 

This field is set as described below. For arithmetic, rounding, and 
conversion instructions, the field is set based on the result placed 
into the target register, except that if any portion of the result is un- 
defined then the value placed into FPRF is undefined. 

15 Floating-Point Result Class Descriptor (C)

Arithmetic, rounding, and conversion instructions may set this bit 
with the FPCC bits, to indicate the class of the result as shown in 
Figure 25 on page 140. 

16:19 Floating-Point Condition Code (FPCC) 

Floating-point Compare instructions set one of the FPCC bits to 1 
and the other three FPCC bits to 0. Arithmetic, rounding, and con- 
version instructions may set the FPCC bits with the C bit, to indi- 
cate the class of the result as shown in Figure 25 on page 140. Note 
that in this case the high-order three bits of the FPCC retain their 
relational significance indicating that the value is less than, greater 
than, or equal to zero. 

16 Floating-Point Less Than or Negative (FL or <) 

17 Floating-Point Greater Than or Positive (FG or >) 

18 Floating-Point Equal or Zero (FE or =)

19 Floating-Point Unordered or NaN (FU or ?)

20 Reserved 

21 Floating-Point Invalid Operation Exception (Software Request) 
(VXSOFT) 

This bit can be altered only by mcrfs, mtfsfi, mtfsf, mtfsb0, or
mtfsb1. See Section 4.4.1, "Invalid Operation Exception," on
page 155.

22 Floating-Point Invalid Operation Exception (Invalid Square 
Root) (VXSQRT) 

See Section 4.4.1, "Invalid Operation Exception," on page 155. 

23 Floating-Point Invalid Operation Exception (Invalid Integer 
Convert) (VXCVI) 

See Section 4.4.1, "Invalid Operation Exception," on page 155. 



Programming Note

If the implementation does not support the Floating Square Root
instruction or the Floating Reciprocal Square Root Estimate instruction,
software can simulate the instruction and set FPSCRVXSQRT to reflect
the exception.

Figure 25. Floating-Point Result Flags

24 Floating-Point Invalid Operation Exception Enable (VE)

See Section 4.4.1, "Invalid Operation Exception," on page 155. 

25 Floating-Point Overflow Exception Enable (OE) 

See Section 4.4.3, "Overflow Exception," on page 159. 

26 Floating-Point Underflow Exception Enable (UE) 

See Section 4.4.4, "Underflow Exception," on page 160.

27 Floating-Point Zero Divide Exception Enable (ZE) 

See Section 4.4.2, "Zero Divide Exception," on page 158. 

28 Floating-Point Inexact Exception Enable (XE) 

See Section 4.4.5, "Inexact Exception," on page 162. 

29 Floating-Point Non-IEEE Mode (NI)

If this bit is set to 1, the remaining FPSCR bits may have meanings
other than those given in this document, and the results of
floating-point operations need not conform to the IEEE standard. If the
IEEE-conforming result of a floating-point operation would be a
denormalized number, the result of that operation is 0 (with the
same sign as the denormalized number) if FPSCRNI=1 and other
requirements specified in the Book IV, PowerPC Implementation
Features for the implementation are met. The other effects of
setting this bit to 1 are described in Book IV and may differ between
implementations.

30:31 Floating-Point Rounding Control (RN)

See Section 4.3.6, "Rounding," on page 149.

00 Round to Nearest

01 Round toward Zero

10 Round toward +Infinity

11 Round toward -Infinity
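Since FEX and VX are defined purely as ORs of other FPSCR bits, their values can be recomputed from the rest of the register. The Python sketch below does this using the bit numbers listed in the format above, where bit 0 is the most significant bit of the 32-bit register. The names bit_mask, with_summary_bits and rounding_mode are assumptions of the example. Pairing each exception bit with its enable bit (VX/VE, OX/OE, UX/UE, ZX/ZE, XX/XE) is our reading of "masked by their respective enable bits".

```python
def bit_mask(i: int) -> int:
    """Mask for FPSCR bit i, with bit 0 as the most significant of 32 bits."""
    return 1 << (31 - i)


FEX, VX = 1, 2
INVALID_OPERATION_BITS = (7, 8, 9, 10, 11, 12, 21, 22, 23)
EXCEPTION_ENABLE_PAIRS = ((2, 24), (3, 25), (4, 26), (5, 27), (6, 28))
ROUNDING_MODES = ("Round to Nearest", "Round toward Zero",
                  "Round toward +Infinity", "Round toward -Infinity")


def with_summary_bits(fpscr: int) -> int:
    """Recompute VX and FEX from the other FPSCR bits."""
    fpscr &= ~(bit_mask(FEX) | bit_mask(VX)) & 0xFFFFFFFF
    if any(fpscr & bit_mask(i) for i in INVALID_OPERATION_BITS):
        fpscr |= bit_mask(VX)                 # VX must be settled before FEX
    if any(fpscr & bit_mask(x) and fpscr & bit_mask(e)
           for x, e in EXCEPTION_ENABLE_PAIRS):
        fpscr |= bit_mask(FEX)
    return fpscr


def rounding_mode(fpscr: int) -> str:
    """Decode the RN field (bits 30:31)."""
    return ROUNDING_MODES[fpscr & 0b11]


# VXSNAN set with VE enabled: both VX and FEX become 1.
print(hex(with_summary_bits(bit_mask(7) | bit_mask(24))))
```

This also shows why the two summary bits cannot be written directly: any value stored into them is overridden by the state of the bits they summarize.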



4.3 Floating-Point Data 



4.3.1 Data Format 

This architecture defines the representation of a floating-point value in 
two different binary fixed-length formats. The format may be a 32-bit 
single format for a single-precision value or a 64-bit double format for a 
double-precision value. The single format may be used for data in stor- 
age. The double format may be used for data in storage and for data in 
floating-point registers. 

The lengths of the exponent and the fraction fields differ between these 
two formats. The structure of the single and double formats is shown 
below: 



S       EXP             FRACTION
0       1               9               31

Figure 26. Floating-point single format

Figure 27. Floating-point double format

Figure 28. IEEE floating-point fields

Values in floating-point format are composed of three fields: 
S sign bit 

EXP exponent+bias 
FRACTION fraction 

If only a portion of a floating-point data item in storage is accessed, 
such as with a load or store instruction for a byte or halfword (or word in 
the case of floating-point double format), the value affected will depend 
on whether the PowerPC system is operating with Big-Endian byte order 
(the default), or Little-Endian byte order. See Appendix D, "Little-Endian 
Byte Ordering," on page 233. 

Representation of numeric values in the floating-point formats consists
of a sign bit (S), a biased exponent (EXP), and the fraction portion
FRACTION of the significand. The significand consists of a leading
implied bit concatenated on the right with the FRACTION. This leading
implied bit is 1 for normalized numbers and 0 for denormalized numbers
and is located in the unit bit position (i.e., the first bit to the left of the
binary point). Values representable within the two floating-point formats
can be specified by the parameters listed in Figure 28 on page 142.

The architecture requires that the FPRs of the Floating-Point Processor 
support the floating-point double format only. 
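The three fields S, EXP and FRACTION can be pulled out of a value with a few shifts and masks. The following Python sketch uses the host's IEEE formats through the struct module. The function names are hypothetical, and the field widths (8-bit EXP and 23-bit FRACTION for single, 11-bit EXP and 52-bit FRACTION for double) are those of the IEEE standard.

```python
import struct


def fields_single(x: float):
    """Return (S, EXP, FRACTION) of x rounded to single format."""
    (w,) = struct.unpack(">I", struct.pack(">f", x))
    return w >> 31, (w >> 23) & 0xFF, w & 0x7FFFFF


def fields_double(x: float):
    """Return (S, EXP, FRACTION) of x in double format."""
    (d,) = struct.unpack(">Q", struct.pack(">d", x))
    return d >> 63, (d >> 52) & 0x7FF, d & ((1 << 52) - 1)


print(fields_single(1.5), fields_double(-2.5))
```

EXP is returned in its biased form, so 1.0 has EXP 127 in single format and 1023 in double format.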

4.3.2 Value Representation 

This architecture defines numeric and nonnumeric values representable 
within each of the two supported formats. The numeric values are 
approximations to the real numbers and include the normalized numbers, 
denormalized numbers, and zero values. The nonnumeric values repre- 
sentable are the infinities and the Not a Numbers (NaNs). The infinities 
are adjoined to the real numbers, but are not numbers themselves, and 
the standard rules of arithmetic do not hold when they are used in an 
operation. They are related to the real numbers by order alone. It is possi- 
ble, however, to define restricted operations among numbers and infini- 
ties as defined below. The relative location on the real number line for 
each of the defined entities is shown in Figure 29.

Figure 29. Approximation to real numbers

The NaNs are not related to the numeric values or infinities by order 
or value but are encodings used to convey diagnostic information such as 
the representation of uninitialized variables. 

The following is a description of the different floating-point values 
defined in the architecture: 

Binary floating-point numbers 

Machine representable values used as approximations to real numbers. 
Three categories of numbers are supported: normalized numbers, denor- 
malized numbers, and zero values. 

Normalized numbers (± NOR) 

These are values that have a biased exponent value in the range:
1 to 254 in single format
1 to 2046 in double format

They are values in which the implied unit bit is 1. Normalized numbers
are interpreted as follows:

NOR = (-1)^s × 2^E × (1.fraction)

where s is the sign, E is the unbiased exponent, and 1.fraction is the
significand, which is composed of a leading unit bit (implied bit) and a
fraction part.

The ranges covered by the magnitude (M) of a normalized floating-point
number are approximately equal to:

Single Format:

1.2×10^-38 ≤ M ≤ 3.4×10^38

Double Format:

2.2×10^-308 ≤ M ≤ 1.8×10^308

Zero values (± 0) 

These are values that have a biased exponent value of zero and a fraction 
value of zero. Zeros can have a positive or negative sign. The sign of zero 
is ignored by comparison operations (i.e., comparison regards +0 as equal 
to -0). 

Denormalized numbers (± DEN) 

These are values that have a biased exponent value of zero and a nonzero 
fraction value. They are nonzero numbers smaller in magnitude than the 
representable normalized numbers. They are values in which the implied 
unit bit is 0. Denormalized numbers are interpreted as follows: 

DEN = (-1)^s × 2^Emin × (0.fraction)

where Emin is the minimum representable exponent value (-126 for sin- 
gle-precision, -1022 for double-precision). 
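The interpretation rules for normalized, zero, and denormalized values can be checked with exact rational arithmetic. The sketch below applies them to a 64-bit double-format image. The names image_of and double_value are assumptions of the example, and the exponent bias of 1023 is taken from the IEEE standard.

```python
import struct
from fractions import Fraction


def image_of(x: float) -> int:
    """64-bit double-format image of a host float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def double_value(image: int) -> Fraction:
    """Exact value of a finite double-format image."""
    s = image >> 63
    exp = (image >> 52) & 0x7FF
    frac = Fraction(image & ((1 << 52) - 1), 1 << 52)    # 0.fraction
    if 1 <= exp <= 2046:                                  # normalized
        magnitude = (1 + frac) * Fraction(2) ** (exp - 1023)
    elif exp == 0:                                        # zero or denormalized
        magnitude = frac * Fraction(2) ** -1022           # Emin = -1022
    else:
        raise ValueError("infinity or NaN has no numeric value")
    return -magnitude if s else magnitude


print(double_value(image_of(1.5)), double_value(image_of(-0.375)))
```

Zero falls out of the denormalized formula naturally, because a zero fraction with a zero biased exponent gives a magnitude of zero.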

Infinities (± ∞)

These are values that have the maximum biased exponent value: 
255 in the single format 
2047 in the double format 

and a zero fraction value. They are used to approximate values greater in 
magnitude than the maximum normalized value. 

Infinity arithmetic is defined as the limiting case of real arithmetic,
with restricted operations defined among numbers and infinities. Infinities
and the real numbers can be related by ordering in the affine sense:

-∞ < every finite number < +∞

Arithmetic on infinities is always exact and does not signal any excep- 
tion, except when an exception occurs due to the invalid operations as 
described in Section 4.4.1, "Invalid Operation Exception," on page 155. 

Not a Numbers (NaNs) 

These are values that have the maximum biased exponent value and a 
nonzero fraction value. The sign bit is ignored (i.e., NaNs are neither pos- 
itive nor negative). If the high-order bit of the fraction field is 0 then the 
NaN is a Signalling NaN; otherwise it is a Quiet NaN. 
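The value classes defined in this section can be told apart by inspecting the biased exponent and fraction of a double-format image. The Python sketch below returns labels that follow the abbreviations used in this section (NOR, DEN, INF). The function name classify_double and the string labels are choices made for the example.

```python
def classify_double(image: int) -> str:
    """Classify a 64-bit double-format image."""
    exp = (image >> 52) & 0x7FF
    frac = image & ((1 << 52) - 1)
    sign = "-" if image >> 63 else "+"
    if exp == 0x7FF:                       # maximum biased exponent
        if frac == 0:
            return sign + "INF"
        # High-order fraction bit: 0 for a Signalling NaN, 1 for a Quiet NaN.
        return "QNaN" if frac >> 51 else "SNaN"
    if exp == 0:
        return sign + ("0" if frac == 0 else "DEN")
    return sign + "NOR"


print(classify_double(0x7FF8000000000000), classify_double(0x8000000000000001))
```

The sign is left out of the NaN labels because the sign bit of a NaN is ignored.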

Signalling NaNs are used to signal exceptions when they appear as 
operands of computational instructions. 

Quiet NaNs are used to represent the results of certain invalid
operations, such as invalid arithmetic operations on infinities or on NaNs,
when Invalid Operation Exception is disabled (FPSCRVE=0). Quiet NaNs
propagate through all floating-point operations except ordered
comparison, Floating Round to Single-Precision, and conversion to integer.
Quiet NaNs do not signal exceptions, except for ordered comparison and
conversion to integer operations. Specific encodings in QNaNs can thus be
preserved through a sequence of floating-point operations, and used to
convey diagnostic information to help identify results from invalid
operations.

When a QNaN is the result of a floating-point operation because one 
of the operands is a NaN or because a QNaN was generated due to a dis- 
abled Invalid Operation Exception, then the following rule is applied to 
determine the NaN with the high-order fraction bit set to 1 that is to be 
stored as the result. 

if (FRA) is a NaN
    then FRT <- (FRA)
    else if (FRB) is a NaN
        then if instruction is frsp
            then FRT <- (FRB)[0:34] || 29 zero bits
            else FRT <- (FRB)
        else if (FRC) is a NaN
            then FRT <- (FRC)
            else if generated QNaN
                then FRT <- generated QNaN

If the operand specified by FRA is a NaN, then that NaN is stored as 
the result. Otherwise, if the operand specified by FRB is a NaN (if the 
instruction specifies an FRB operand), then that NaN is stored as the 
result, with the low-order 29 bits of the result set to 0 if the instruction is
frsp. Otherwise, if the operand specified by FRC is a NaN (if the instruction
specifies an FRC operand), then that NaN is stored as the result.
Otherwise, if a QNaN was generated due to a disabled Invalid Operation
Exception, then that QNaN is stored as the result. If a QNaN is to be
generated as a result, then the QNaN generated has a sign bit of 0, an
exponent field of all 1s, and a high-order fraction bit of 1 with all other
fraction bits 0. Any instruction that generates a QNaN as the result of a 
disabled Invalid Operation must generate this QNaN (i.e., 
0x7FF8_0000_0000_0000). 

A double-precision NaN is considered to be representable in single 
format if and only if the low-order 29 bits of the double-precision NaN's 
fraction are zero. 
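
The NaN selection rule above can be modeled on 64-bit integers. The following Python sketch is illustrative only: the function names and the use of None for an absent operand are assumptions, and setting the high-order fraction bit on the selected operand follows the statement that the stored NaN has that bit set to 1.

```python
DEFAULT_QNAN = 0x7FF8_0000_0000_0000   # QNaN generated by a disabled Invalid Operation
EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000      # high-order fraction bit
LOW29_MASK = (1 << 29) - 1             # fraction bits that do not fit in single format


def is_nan(bits):
    """True if the 64-bit pattern has the maximum exponent and a nonzero fraction."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def select_qnan(fra=None, frb=None, frc=None, is_frsp=False, generated=False):
    """Return the QNaN to store in FRT, or None if no NaN is involved."""
    if fra is not None and is_nan(fra):
        return fra | QUIET_BIT
    if frb is not None and is_nan(frb):
        # For frsp, keep bits 0:34 and clear the low-order 29 bits.
        kept = (frb & ~LOW29_MASK) if is_frsp else frb
        return kept | QUIET_BIT
    if frc is not None and is_nan(frc):
        return frc | QUIET_BIT
    return DEFAULT_QNAN if generated else None


def nan_representable_in_single(bits):
    """True if a double-precision NaN has zero low-order 29 fraction bits."""
    return is_nan(bits) and (bits & LOW29_MASK) == 0
```

Because the FRA operand is tested first, a NaN in FRA takes priority over NaNs in FRB and FRC, which is how diagnostic payloads survive a sequence of operations.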

4.3.3 Sign of Result 

The following rules govern the sign of the result of an arithmetic, round- 
ing, or conversion operation, when the operation does not yield an excep- 
tion. They apply even when the operands or results are zeros or infinities. 

■ The sign of the result of an add operation is the sign of the operand 
having the larger absolute value. If both operands have the same sign, 
the sign of the result of an add operation is the same as the sign of the 
operands. The sign of the result of the subtract operation x-y is the 
same as the sign of the result of the add operation x+(-y). 

When the sum of two operands with opposite sign, or the difference of 
two operands with the same sign, is exactly zero, the sign of the result 
is positive in all rounding modes except Round toward -Infinity, in 
which mode the sign is negative. 

■ The sign of the result of a multiply or divide operation is the Exclusive 
OR of the signs of the operands. 

■ The sign of the result of a Square Root or Reciprocal Square Root 
Estimate operation is always positive, except that the square root of 
-0 is -0 and the reciprocal square root of -0 is -Infinity. 

■ The sign of the result of a Round to Single-Precision or Convert To/ 
From Integer operation is the sign of the operand being converted. 

For the Multiply-Add instructions, the rules given above are applied 
first to the multiply operation and then to the add or subtract operation 
(one of the inputs to the add or subtract operation is the result of the 
multiply operation). 
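
Several of the sign rules above can be observed on any host that implements IEEE arithmetic. The following Python sketch is illustrative only; it assumes the host performs double-precision arithmetic in Round to Nearest mode (the usual default), and the helper name sign_bit is hypothetical. It does not exercise a PowerPC processor.

```python
import math


def sign_bit(x):
    """Return 1 if the sign bit of x is set (including -0.0), else 0."""
    return 1 if math.copysign(1.0, x) < 0 else 0


# Sum of operands with opposite signs that is exactly zero:
# the result is +0 in Round to Nearest.
zero_sum = 1.5 + (-1.5)

# Operands with the same sign: the result takes that sign, even for zeros.
negative_zero_sum = -0.0 + -0.0

# Multiply: the sign is the Exclusive OR of the operand signs.
product = -0.0 * 3.0

# The square root of -0 is -0.
root = math.sqrt(-0.0)
```

Here sign_bit(zero_sum) is 0, while sign_bit(negative_zero_sum), sign_bit(product), and sign_bit(root) are all 1.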

4.3.4 Normalization and Denormalization

The intermediate result of an arithmetic or frsp instruction may require
normalization and/or denormalization as described below. Normalization
and denormalization do not affect the sign of the result.

When an arithmetic or frsp instruction produces an intermediate 
result, consisting of a sign bit, an exponent, and a nonzero significand 
with a 0 leading bit, it is not a normalized number and must be normal- 
ized before it is stored. 

A number is normalized by shifting its significand left while decre- 
menting its exponent by 1 for each bit shifted, until the leading signifi- 
cand bit becomes 1. The Guard bit and the Round bit (see Section 4.5.1, 
"Execution Model for IEEE Operations," on page 163) participate in the 
shift with zeros shifted into the Round bit. The exponent is regarded as if 
its range were unlimited. 

After normalization, or if normalization was not required, the inter- 
mediate result may have a nonzero significand and an exponent value 
that is less than the minimum value that can be represented in the format 
specified for the result. In this case, the intermediate result is said to be 
"Tiny" and the stored result is determined by the rules described in Sec- 
tion 4.4.4, "Underflow Exception," on page 160. These rules may require 
denormalization.

A number is denormalized by shifting its significand right while incre- 
menting its exponent by 1 for each bit shifted, until the exponent is equal 
to the format's minimum value. If any significant bits are lost in this shift- 
ing process then "Loss of Accuracy" has occurred (see Section 4.4.4, 
"Underflow Exception," on page 160) and Underflow Exception is 
signaled. 
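
The two shifting procedures above can be sketched on an integer significand. The following Python code is illustrative only and makes simplifying assumptions: the significand is held as a width-bit integer, the Guard and Round bits are not modeled, and the function names are hypothetical.

```python
def normalize(significand, exponent, width):
    """Shift left, decrementing the exponent, until the leading bit is 1."""
    if significand == 0:
        raise ValueError("a zero significand cannot be normalized")
    while not ((significand >> (width - 1)) & 1):
        significand <<= 1
        exponent -= 1            # the exponent range is treated as unlimited
    return significand, exponent


def denormalize(significand, exponent, emin):
    """Shift right, incrementing the exponent, until it equals emin.

    Returns (significand, exponent, lost), where lost is True if any
    nonzero bit was shifted out, that is, "Loss of Accuracy" occurred.
    """
    lost = False
    while exponent < emin:
        lost = lost or bool(significand & 1)
        significand >>= 1
        exponent += 1
    return significand, exponent, lost
```

For a 4-bit significand, normalize(0b0011, 0, 4) shifts twice, and denormalize(0b1101, -128, -126) reports a loss because a 1 bit is shifted out.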



4.3.5 Data Handling and Precision

Instructions are defined to move floating-point data between the FPRs 
and storage. For double format data, the data are not altered during the 
move. For single format data, a format conversion from single to double 
is performed when loading from storage into an FPR and a format con- 
version from double to single is performed when storing from an FPR to 
storage. No floating-point exceptions are caused by these instructions. 

All computational, Move, and Select instructions use the floating-point
double format. 

Floating-point single-precision is obtained with the implementation of 
four types of instruction. 



Programming Note 

A single-precision value 
can be used in double- 
precision arithmetic 
operations. The reverse 
is true only if the double- 
precision value is 
representable in single 
format. 

Some implementations 
may execute single- 
precision arithmetic 
instructions faster than 
double-precision 
arithmetic instructions. 
Therefore, if double- 
precision accuracy is not 
required, single- 
precision data and 
instructions should be 
used. 

Programming Note

The Floating Round to 
Single-Precision 
instruction is provided to 
allow value conversion 
from double-precision to 
single-precision with 
appropriate exception 
checking and rounding. 
This instruction should be 
used to convert double- 
precision floating-point 
values (produced by 
double-precision load 
and arithmetic 
instructions and by fcfid) 
to single-precision values 
prior to storing them into 
single format storage 
elements or using them 
as operands for single- 
precision arithmetic 
instructions. Values 
produced by single- 
precision load and 
arithmetic instructions 
are already single- 
precision values and can 
be stored directly into 
single format storage 
elements, or used directly 
as operands for single- 
precision arithmetic 
instructions, without 
preceding the store, or 
the arithmetic 
instruction, by a Floating
Round to Single-
Precision instruction.



1. Load Floating-Point Single

This form of instruction accesses a single-precision operand in single
format in storage, converts it to double format, and loads it into an
FPR. No floating-point exceptions are caused by these instructions.

2. Round to Floating-Point Single-Precision 

The Floating Round to Single-Precision instruction rounds a double- 
precision operand to single-precision if the operand is not already in 
single-precision range, checking the exponent for single-precision 
range and handling any exceptions according to respective enable bits, 
and places that operand into an FPR as a double-precision operand. 
For results produced by single-precision arithmetic instructions, single- 
precision loads, and other instances of the Floating Round to Single- 
Precision instruction, this operation does not alter the value. 

3. Single-Precision Arithmetic Instructions 

This form of instruction takes operands from the FPRs in double for- 
mat, performs the operation as if it produced an intermediate result 
correct to infinite precision and with unbounded range, and then 
coerces this intermediate result to fit in single format. Status bits, in 
the FPSCR and optionally in the Condition Register, are set to reflect 
the single-precision result. The result is then converted to double for- 
mat and placed into an FPR. The result lies in the range supported by 
the single format. 

All input values must be representable in single format: if they are not, 
the result placed into the target FPR, and the setting of status bits in 
the FPSCR and in the Condition Register (if Rc=1), are undefined.

4. Store Floating-Point Single 

This form of instruction converts a double-precision operand to single 
format and stores that operand into storage. No floating-point excep- 
tions are caused by these instructions. (The value being stored is effec- 
tively assumed to be the result of an instruction of one of the preceding 
three types.) 

When the result of a Load Floating-Point Single, Floating Round to 
Single-Precision, or single-precision arithmetic instruction is stored in an 
FPR, the low-order 29 FRACTION bits are zero. 
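
The statement about the low-order 29 FRACTION bits can be checked on a host system. The following Python sketch is illustrative only: it uses the standard struct module to round a double to single format in the host's Round to Nearest mode, which only approximates the value conversion performed by Floating Round to Single-Precision and does not model FPSCR status bits or exception handling. The function names are hypothetical.

```python
import struct


def round_to_single(x):
    """Round a double to single format and return it widened back to double.

    struct.pack raises OverflowError if x is too large for single format.
    """
    return struct.unpack(">f", struct.pack(">f", x))[0]


def low29_fraction_bits(x):
    """Return the low-order 29 fraction bits of x in double format."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits & ((1 << 29) - 1)
```

The double nearest to 0.1 has nonzero low-order fraction bits, but after round_to_single those 29 bits are zero, so the value is representable in single format.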

4.3.6 Rounding

With the exception of the two optional Estimate instructions, Floating 
Reciprocal Estimate Single and Floating Reciprocal Square Root Esti- 
mate, all arithmetic, rounding, and conversion instructions defined by 
this architecture produce an intermediate result that can be regarded as 
being infinitely precise. This result must then be written with a precision 
of finite length into an FPR. After normalization or denormalization, if 
the infinitely precise intermediate result is not representable in the preci- 
sion required by the instruction then it is rounded before being placed 
into the target FPR. 

The instructions that may round their result are the Arithmetic and 
Rounding and Conversion instructions. For a given instance of one of 
these instructions, whether rounding actually occurs depends on the val- 
ues of the inputs. Each of these instructions sets FPSCR bits FR and FI 
according to whether rounding occurred (FI) and whether the fraction 
was incremented (FR). If rounding occurred, FI is set to 1, and FR may be 
set to either 0 or 1. If rounding did not occur, both FR and FI are set to 0. 

The two Estimate instructions set FR and FI to undefined values. The 
remaining floating-point instructions do not alter FR and FI. 

Four user-selectable rounding modes are provided through the Float- 
ing-Point Rounding Control field in the FPSCR. See Section 4.2.2, "Float- 
ing-Point Status and Control Register," on page 137. These are encoded 
as follows:

RN    Rounding Mode

00    Round to Nearest
01    Round toward Zero
10    Round toward +Infinity
11    Round toward -Infinity

Let Z be the infinitely precise intermediate arithmetic result or the 
operand of a convert operation. If Z can be represented exactly in the tar- 
get format, then no rounding occurs, and the result in all rounding modes 
is equivalent to truncation of Z. If Z cannot be represented exactly in the 
target format, let Z1 and Z2 bound Z as the next larger and next smaller
numbers representable in the target format. Then Z1 or Z2 can be used
to approximate the result in the target format.

Figure 30 on page 150 shows the relation of Z, Z1, and Z2 in this
case. The following rules specify the rounding in the four modes. "LSB" 
means "least significant bit." 

[Figure 30. Selection of Z1 and Z2. The figure is a number line on which
Z2 < Z < Z1 for both negative and positive values of Z, with the labels
"By Incrementing LSB of Z", "Infinitely Precise Value", and "By Truncating
after LSB".]



Round to Nearest

Choose the best approximation (Z1 or Z2). In case of a tie, choose the
one that is even (least significant bit 0).

Round toward Zero

Choose the smaller in magnitude (Z1 or Z2).

Round toward +Infinity

Choose Z1.

Round toward -Infinity

Choose Z2.



See Section 4.5.1, "Execution Model for IEEE Operations," on 
page 163 for a detailed explanation of rounding. 
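
The selection between Z1 and Z2 can be sketched with exact rational arithmetic. The following Python code is illustrative only: as a simplifying assumption, the integers stand in for the values representable in the target format, so an even value is an even integer. The function name round_z is hypothetical; the mode argument uses the RN encodings given above.

```python
from fractions import Fraction
import math


def round_z(z, rn):
    """Round the exact value z (a Fraction) to an integer under mode rn."""
    z2 = math.floor(z)           # next smaller representable value
    z1 = math.ceil(z)            # next larger representable value
    if z1 == z2:
        return z1                # Z is exactly representable: no rounding occurs
    if rn == 0b00:               # Round to Nearest, ties to even
        below, above = z - z2, z1 - z
        if below != above:
            return z2 if below < above else z1
        return z2 if z2 % 2 == 0 else z1
    if rn == 0b01:               # Round toward Zero: smaller in magnitude
        return z2 if z > 0 else z1
    if rn == 0b10:               # Round toward +Infinity
        return z1
    if rn == 0b11:               # Round toward -Infinity
        return z2
    raise ValueError("rn must be a 2-bit rounding mode")
```

For Z = -5/2 the four modes give -2, -2, -2, and -3 respectively, while for Z = 7/2 Round to Nearest resolves the tie to the even value 4.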

An Overflow Exception or an Underflow Exception may occur, as 
described in Section 4.4.3, "Overflow Exception," on page 159, and Sec- 
tion 4.4.4, "Underflow Exception," on page 160. 



4.4 Floating-Point Exceptions 

This architecture defines the following floating-point exceptions: 

■ Invalid Operation Exception

    SNaN
    Infinity - Infinity
    Infinity ÷ Infinity
    Zero ÷ Zero
    Infinity × Zero
    Invalid Compare
    Software Request
    Invalid Square Root
    Invalid Integer Convert

■ Zero Divide Exception 

■ Overflow Exception

■ Underflow Exception 

■ Inexact Exception 

These exceptions may occur during execution of computational 
instructions. In addition, an Invalid Operation Exception occurs when a 
Floating-Point Status and Control Register instruction sets FPSCR[VXSOFT]
to 1 (Software Request). An Invalid Square Root exception can
occur only if at least one of the Floating Square Root instructions defined 
in Appendix A, "Optional Instructions," on page 197, is implemented. 

Each floating-point exception, and each category of Invalid Operation 
Exception, has an exception bit in the FPSCR. In addition, each floating- 
point exception has a corresponding enable bit in the FPSCR. The excep- 
tion bit indicates occurrence of the corresponding exception. If an excep- 
tion occurs, the corresponding enable bit governs the result produced by 
the instruction and, in conjunction with the FE0 and FE1 bits (see page
153), whether and how the system floating-point enabled exception error
handler is invoked. (In general, the enabling specified by the enable bit is 
one of invoking the system error handler, not of permitting the exception 
to occur. The occurrence of an exception depends only on the instruction 
and its inputs, not on the setting of any control bits. The only deviation 
from this general rule is that the occurrence of an Underflow Exception 
may depend on the setting of the enable bit.) 

The Floating-Point Exception Summary bit (FX) in the FPSCR is set to 
1 by any floating-point instruction, except mtfsfi and mtfsf, that causes 
any of the floating-point exception bits in the FPSCR to change from 0 to 
1, or by an mtfsfi, mtfsf, or mtfsb1 instruction that explicitly sets the bit to
1. The Floating-Point Enabled Exception Summary bit (FEX) in the 
FPSCR is set when any of the exceptions is set and the exception is 
enabled (enable bit is 1). 

A single instruction, other than mtfsfi or mtfsf, may set more than one
exception only in the following cases: 

■ Inexact Exception may be set with Overflow Exception.

■ Inexact Exception may be set with Underflow Exception.

■ Invalid Operation Exception (SNaN) is set with Invalid Operation
Exception (∞ × 0) for Multiply-Add instructions for which the values
being multiplied are infinity and zero and the value being added is an 
SNaN. 

■ Invalid Operation Exception (SNaN) may be set with Invalid Opera- 
tion Exception (Invalid Compare) for Compare Ordered instructions. 

■ Invalid Operation Exception (SNaN) may be set with Invalid Opera- 
tion Exception (Invalid Integer Convert) for Convert to Integer 
instructions. 

When an exception occurs the instruction execution may be sup- 
pressed or a result may be delivered, depending on the exception. 

Instruction execution is suppressed for the following kinds of excep- 
tion, so that there is no possibility that one of the operands is lost. 

■ Enabled Invalid Operation 

■ Enabled Zero Divide 

For the remaining kinds of exception, a result is generated and written 
to the destination specified by the instruction causing the exception. The 
result may be a different value for the enabled and disabled conditions for 
some of these exceptions. The kinds of exception that deliver a result are 
the following. 

■ Disabled Invalid Operation

■ Disabled Zero Divide 

■ Disabled Overflow 

■ Disabled Underflow 

■ Disabled Inexact 

■ Enabled Overflow 

■ Enabled Underflow 

■ Enabled Inexact 

Subsequent sections define each of the floating-point exceptions and 
specify the action that is taken when they are detected. 

The IEEE standard specifies the handling of exceptional conditions in
terms of "traps" and "trap handlers." In this architecture, an FPSCR 
exception enable bit of 1 causes generation of the result value specified in 
the IEEE standard for the "trap enabled" case: the expectation is that the 
exception will be detected by software, which will revise the result. An 
FPSCR exception enable bit of 0 causes generation of the "default result" 
value specified for the "trap disabled" (or "no trap occurs" or "trap is 
not implemented") case: the expectation is that the exception will not be 
detected by software, which will simply use the default result. The result 
to be delivered in each case for each exception is described in the sections 
below. 

The IEEE default behavior when an exception occurs is to generate a 
default value and not to notify software. In this architecture, if the IEEE 
default behavior when an exception occurs is desired for all exceptions, 
all FPSCR exception enable bits should be set to 0 and Ignore Exceptions 
Mode (see below) should be used. In this case the system floating-point 
enabled exception error handler is not invoked, even if floating-point 
exceptions occur: software can inspect the FPSCR exception bits if neces- 
sary, to determine whether exceptions have occurred. 

In this architecture, if software is to be notified that a given kind of 
exception has occurred, the corresponding FPSCR exception enable bit 
must be set to 1 and a mode other than Ignore Exceptions Mode must be 
used. In this case the system floating-point enabled exception error han- 
dler is invoked if an enabled floating-point exception occurs. 

The FE0 and FE1 bits control whether and how the system floating-
point enabled exception error handler is invoked if an enabled floating- 
point exception occurs. The location of these bits and the requirements 
for altering them are described in Book III, Section 2.2.3, "Machine State 
Register," on page 374 and Chapter 7, "Synchronization Requirements 
for Special Registers and for Lookaside Buffers," on page 483. (The sys- 
tem floating-point enabled exception error handler is never invoked 
because of a disabled floating-point exception.) The effects of the four 
possible settings of these bits are as follows. 

FE0  FE1  Description

0    0    Ignore Exceptions Mode

          Floating-point exceptions do not cause the system floating-point
          enabled exception error handler to be invoked.

0    1    Imprecise Nonrecoverable Mode

          The system floating-point enabled exception error handler is
          invoked at some point at or beyond the instruction that caused
          the enabled exception. It may not be possible to identify the
          excepting instruction or the data that caused the exception.
          Results produced by the excepting instruction may have been used
          by or may have affected subsequent instructions that are
          executed before the error handler is invoked.

1    0    Imprecise Recoverable Mode

          The system floating-point enabled exception error handler is
          invoked at some point at or beyond the instruction that caused
          the enabled exception. Sufficient information is provided to the
          error handler that it can identify the excepting instruction and
          the operands, and correct the result. No results produced by
          the excepting instruction have been used by or have affected
          subsequent instructions that are executed before the error
          handler is invoked.

1    1    Precise Mode

          The system floating-point enabled exception error handler is
          invoked precisely at the instruction that caused the enabled
          exception.

Programming Note

In any of the three non-Precise modes, a Floating-Point Status and Control
Register instruction can be used to force any exceptions, due to
instructions initiated before the Floating-Point Status and Control
Register instruction, to be recorded in the FPSCR. (This forcing is
superfluous for Precise Mode.)

In either of the Imprecise modes, a Floating-Point Status and Control
Register instruction can be used to force any invocations of the system
floating-point enabled exception error handler, due to instructions
initiated before the Floating-Point Status and Control Register
instruction, to occur. (This forcing has no effect in Ignore Exceptions
Mode and is superfluous for Precise Mode.)

A sync instruction, or any other execution synchronizing instruction or
event (e.g., isync: see Book II, "Instruction Synchronize XL-form," on
page 346), also has the effects described above. However, in order to
obtain the best performance across the widest range of implementations, a
Floating-Point Status and Control Register instruction should be used to
obtain these effects.

In all cases, the question of whether a floating-point result is stored, 
and what value is stored, is governed by the FPSCR exception enable bits, 
as described in subsequent sections, and is not affected by the value of the 
FE0 and FE1 bits.

In all cases in which the system floating-point enabled exception error 
handler is invoked, all instructions before the instruction at which the 
system floating-point enabled exception error handler is invoked have 
completed, and no instruction after the instruction at which the system 
floating-point enabled exception error handler is invoked has been exe- 
cuted. (Recall that, for the two Imprecise modes, the instruction at which 
the system floating-point enabled exception error handler is invoked need 
not be the instruction that caused the exception.) The instruction at 
which the system floating-point enabled exception error handler is 
invoked has not been executed unless it is the excepting instruction, in 
which case it has been executed if the exception is not among those listed 
on page 152 as suppressed. 

In order to obtain the best performance across the widest range of 
implementations, the programmer should obey the following guidelines. 

■ If the IEEE default results are acceptable to the application, Ignore
Exceptions Mode should be used with all FPSCR exception enable bits 
set to 0. 

■ If the IEEE default results are not acceptable to the application,
Imprecise Nonrecoverable Mode should be used, or Imprecise Recoverable
Mode if recoverability is needed, with FPSCR exception enable bits set 
to 1 for those exceptions for which the system floating-point enabled 
exception error handler is to be invoked. 

■ Ignore Exceptions Mode should not, in general, be used when any 
FPSCR exception enable bits are set to 1. 

■ Precise Mode may degrade performance in some implementations, 
perhaps substantially, and therefore should be used only for debugging 
and other specialized applications. 
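
The interaction between the FPSCR enable bits and the FE0 and FE1 bits can be summarized in a small model. The following Python sketch is illustrative only: representing the exception and enable bits as integer masks with matching bit positions is an assumption made for the example and does not reflect the actual FPSCR layout, and the function name handler_invoked is hypothetical.

```python
FP_EXCEPTION_MODES = {
    (0, 0): "Ignore Exceptions Mode",
    (0, 1): "Imprecise Nonrecoverable Mode",
    (1, 0): "Imprecise Recoverable Mode",
    (1, 1): "Precise Mode",
}


def handler_invoked(fe0, fe1, exception_bits, enable_bits):
    """True if the system error handler is to be invoked.

    exception_bits and enable_bits are hypothetical masks in which bit i
    of one corresponds to bit i of the other.
    """
    # An exception that is both set and enabled (the FEX condition).
    enabled_exception = (exception_bits & enable_bits) != 0
    # A disabled exception never invokes the handler, and Ignore
    # Exceptions Mode suppresses invocation for enabled ones.
    return enabled_exception and (fe0, fe1) != (0, 0)
```

With all enable bits 0 the function returns False in every mode, which corresponds to the IEEE default behavior described above.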



4.4.1 Invalid Operation Exception 



Definition 

An Invalid Operation Exception occurs whenever an operand is invalid 
for the specified operation. The invalid operations are: 

■ Any floating-point operation on a signaling NaN (SNaN). 

■ For add or subtract operations, magnitude subtraction of infinities 

■ Division of infinity by infinity (∞ ÷ ∞)

■ Division of zero by zero (0 ÷ 0)

■ Multiplication of infinity by zero (∞ × 0)

■ Ordered comparison involving a NaN (Invalid Compare) 

■ Square root or reciprocal square root of a negative (and nonzero) 
number (Invalid Square Root) 

■ Integer convert involving a large number, an infinity, or a NaN 
(Invalid Integer Convert) 

In addition, an Invalid Operation Exception occurs if software explicitly
requests this by executing an mtfsfi, mtfsf, or mtfsb1 instruction that
sets FPSCR[VXSOFT] to 1 (Software Request). An Invalid Square Root
exception can occur only if at least one of the Floating Square Root 
instructions defined in Appendix A, "Optional Instructions," on 
page 197, is implemented. 



Programming Note 

The purpose of 
FPSCR[VXSOFT] is to allow
software to cause an 
Invalid Operation 
Exception for a condition 
that is not necessarily 
associated with the 
execution of a floating- 
point instruction. For 
example, it might be set 
by a program that 
computes a square root, 
if the source operand is 
negative. 

Action

The action to be taken depends on the setting of the Invalid Operation
Exception Enable bit of the FPSCR.

When Invalid Operation Exception is enabled (FPSCR[VE]=1) and
Invalid Operation occurs or software explicitly requests the exception, 
then the following actions are taken: 

1. One or two Invalid Operation Exceptions are set

FPSCR[VXSNAN]    (if SNaN)
FPSCR[VXISI]     (if ∞ - ∞)
FPSCR[VXIDI]     (if ∞ ÷ ∞)
FPSCR[VXZDZ]     (if 0 ÷ 0)
FPSCR[VXIMZ]     (if ∞ × 0)
FPSCR[VXVC]      (if invalid compare)
FPSCR[VXSOFT]    (if software request)
FPSCR[VXSQRT]    (if invalid square root)
FPSCR[VXCVI]     (if invalid integer convert)



2. If the operation is an arithmetic, Floating Round to Single-Precision,
or convert to integer operation,

the target FPR is unchanged
FPSCR[FR FI] are set to zero
FPSCR[FPRF] is unchanged

3. If the operation is a compare,

FPSCR[FR FI C] are unchanged
FPSCR[FPCC] is set to reflect unordered

4. If software explicitly requests the exception,

FPSCR[FR FI FPRF] are as set by the mtfsfi, mtfsf, or mtfsb1 instruction

When Invalid Operation Exception is disabled (FPSCR[VE]=0) and
Invalid Operation occurs or software explicitly requests the exception, 
then the following actions are taken: 

1. One or two Invalid Operation Exceptions are set

FPSCR[VXSNAN]    (if SNaN)
FPSCR[VXISI]     (if ∞ - ∞)
FPSCR[VXIDI]     (if ∞ ÷ ∞)
FPSCR[VXZDZ]     (if 0 ÷ 0)
FPSCR[VXIMZ]     (if ∞ × 0)
FPSCR[VXVC]      (if invalid compare)
FPSCR[VXSOFT]    (if software request)
FPSCR[VXSQRT]    (if invalid square root)
FPSCR[VXCVI]     (if invalid integer convert)



2. If the operation is an arithmetic or Floating Round to Single-Precision
operation,

the target FPR is set to a Quiet NaN
FPSCR[FR FI] are set to zero
FPSCR[FPRF] is set to indicate the class of the result (Quiet NaN)

3. If the operation is a convert to 64-bit integer operation,

the target FPR is set as follows:

FRT is set to the most positive 64-bit integer if the operand in
FRB is a positive number or +∞, and to the most negative 64-bit
integer if the operand in FRB is a negative number, -∞, or NaN

FPSCR[FR FI] are set to zero
FPSCR[FPRF] is undefined

4. If the operation is a convert to 32-bit integer operation,

the target FPR is set as follows:

FRT[0:31] <- undefined

FRT[32:63] are set to the most positive 32-bit integer if the operand
in FRB is a positive number or +∞, and to the most negative
32-bit integer if the operand in FRB is a negative number, -∞, or
NaN

FPSCR[FR FI] are set to zero
FPSCR[FPRF] is undefined

5. If the operation is a compare,

FPSCR[FR FI C] are unchanged
FPSCR[FPCC] is set to reflect unordered

6. If software explicitly requests the exception,

FPSCR[FR FI FPRF] are as set by the mtfsfi, mtfsf, or mtfsb1 instruction
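
The integer results delivered for a disabled Invalid Integer Convert can be sketched as follows. This Python code is illustrative only: it models only the value placed in the low-order 32 bits of the target for an operand that is already known to be invalid for conversion (a large number, an infinity, or a NaN), and the function name is hypothetical.

```python
import math

INT32_MAX = 2**31 - 1    # most positive 32-bit integer
INT32_MIN = -(2**31)     # most negative 32-bit integer


def disabled_invalid_convert_32(operand):
    """Return the 32-bit result for an operand that is invalid to convert."""
    if math.isnan(operand):
        return INT32_MIN             # a NaN yields the most negative integer
    # Positive numbers and +infinity saturate high; negative numbers
    # and -infinity saturate low.
    return INT32_MAX if operand > 0 else INT32_MIN
```

The 64-bit case follows the same pattern with the most positive and most negative 64-bit integers.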

4.4.2 Zero Divide Exception 
Definition 

A Zero Divide Exception occurs when a Divide instruction is executed 
with a zero divisor value and a finite nonzero dividend value. It also 
occurs when a Reciprocal Estimate instruction (fres or frsqrte) is exe- 
cuted with an operand value of zero. 

Action 

The action to be taken depends on the setting of the Zero Divide Excep- 
tion Enable bit of the FPSCR. 

When Zero Divide Exception is enabled (FPSCR[ZE]=1) and Zero
Divide occurs, then the following actions are taken:

1. Zero Divide Exception is set

FPSCR[ZX] <- 1

2. The target FPR is unchanged

3. FPSCR[FR FI] are set to zero

4. FPSCR[FPRF] is unchanged

When Zero Divide Exception is disabled (FPSCR[ZE]=0) and Zero
Divide occurs, then the following actions are taken:

1. Zero Divide Exception is set

FPSCR[ZX] <- 1

2. The target FPR is set to ± Infinity, where the sign is determined by the
XOR of the signs of the operands

3. FPSCR[FR FI] are set to zero

4. FPSCR[FPRF] is set to indicate the class and sign of the result (± Infinity)
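
The result delivered for a disabled Zero Divide can be sketched as follows. This Python code is illustrative only; it assumes the caller has already established that the divisor is zero and the dividend is finite and nonzero, and the function name is hypothetical.

```python
import math


def disabled_zero_divide_result(dividend, divisor):
    """Return the signed infinity stored when a disabled Zero Divide occurs."""
    dividend_negative = math.copysign(1.0, dividend) < 0
    divisor_negative = math.copysign(1.0, divisor) < 0   # distinguishes -0 from +0
    # The sign of the result is the XOR of the signs of the operands.
    negative = dividend_negative ^ divisor_negative
    return -math.inf if negative else math.inf
```

For example, dividing 1.0 by -0.0 yields -Infinity, while dividing -2.0 by -0.0 yields +Infinity.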

4.4.3 Overflow Exception 
Definition 

Overflow occurs when the magnitude of what would have been the 
rounded result if the exponent range were unbounded exceeds that of the 
largest finite number of the specified result precision. 

Action 

The action to be taken depends on the setting of the Overflow Exception 
Enable bit of the FPSCR. 

When Overflow Exception is enabled (FPSCR[OE]=1) and exponent
overflow occurs, then the following actions are taken:

1. Overflow Exception is set

FPSCR[OX] <- 1

2. For double-precision arithmetic instructions, the exponent of the
normalized intermediate result is adjusted by subtracting 1536

3. For single-precision arithmetic instructions and the Floating Round to
Single-Precision instruction, the exponent of the normalized intermediate
result is adjusted by subtracting 192

4. The adjusted rounded result is placed into the target FPR

5. FPSCR[FPRF] is set to indicate the class and sign of the result (± Normal
Number)

When Overflow Exception is disabled (FPSCR[OE]=0) and overflow
occurs, then the following actions are taken:

1. Overflow Exception is set

FPSCR[OX] <- 1

2. Inexact Exception is set

FPSCR[XX] <- 1

3. The result is determined by the rounding mode (FPSCR[RN]) and the
sign of the intermediate result as follows:

A. Round to Nearest

Store ± Infinity, where the sign is the sign of the intermediate result

B. Round toward Zero

Store the format's largest finite number with the sign of the
intermediate result

C. Round toward +Infinity

For negative overflow, store the format's most negative finite number;
for positive overflow, store +Infinity

D. Round toward -Infinity

For negative overflow, store -Infinity; for positive overflow, store
the format's largest finite number

4. The result is placed into the target FPR

5. FPSCR[FR] is undefined

6. FPSCR[FI] is set to 1

7. FPSCR[FPRF] is set to indicate the class and sign of the result (± Infinity
or ± Normal Number)
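
The choice of result for a disabled Overflow Exception depends only on the rounding mode and the sign of the intermediate result, which can be sketched as a small function. This Python code is illustrative only: the function name is hypothetical, the rn argument uses the RN encodings from Section 4.3.6, and the default for largest assumes the double format as implemented by the host.

```python
import math
import sys


def disabled_overflow_result(rn, negative, largest=sys.float_info.max):
    """Return the value stored when a disabled overflow occurs.

    rn is the 2-bit rounding mode, negative is True for negative overflow,
    and largest is the format's largest finite number.
    """
    if rn == 0b00:               # Round to Nearest: always an infinity
        to_infinity = True
    elif rn == 0b01:             # Round toward Zero: always the largest finite
        to_infinity = False
    elif rn == 0b10:             # Round toward +Infinity
        to_infinity = not negative
    elif rn == 0b11:             # Round toward -Infinity
        to_infinity = negative
    else:
        raise ValueError("rn must be a 2-bit rounding mode")
    magnitude = math.inf if to_infinity else largest
    return -magnitude if negative else magnitude
```

In Round toward +Infinity, for instance, a positive overflow stores +Infinity but a negative overflow stores the most negative finite number.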

4.4.4 Underflow Exception 
Definition 

Underflow Exception is defined separately for the enabled and disabled 
states: 

■ Enabled: 

Underflow occurs when the intermediate result is "Tiny." 

■ Disabled: 

Underflow occurs when the intermediate result is "Tiny" and there is 
"Loss of Accuracy." 

A "Tiny" result is detected before rounding, when a nonzero result 
value computed as though the exponent range were unbounded would be 
less in magnitude than the smallest normalized number. 

If the intermediate result is "Tiny" and Underflow Exception is disabled
(FPSCR[UE]=0) then the intermediate result is denormalized (see
Section 4.3.4, "Normalization and Denormalization," on page 147) and
rounded (see Section 4.3.6, "Rounding," on page 149) before being 
placed into the target FPR. 

"Loss of Accuracy" is detected when the delivered result value differs 
from what would have been computed were both the exponent range and 
precision unbounded. 



Action 



The action to be taken depends on the setting of the Underflow Exception 
Enable bit of the FPSCR. 

When Underflow Exception is enabled (FPSCR[UE]=1) and exponent 
underflow occurs, then the following actions are taken: 

1. Underflow Exception is set 

FPSCR[UX] ← 1 

2. For double-precision arithmetic instructions, the exponent of the 
normalized intermediate result is adjusted by adding 1536 

3. For single-precision arithmetic instructions and the Floating Round to 
Single-Precision instruction, the exponent of the normalized 
intermediate result is adjusted by adding 192 

4. The adjusted rounded result is placed into the target FPR 

5. FPSCR[FPRF] is set to indicate the class and sign of the result 
(± Normalized Number) 



Programming Note 

The FR and FI bits are provided to allow the system floating-point 
enabled exception error handler, when invoked because of an Underflow 
Exception, to simulate a "trap disabled" environment. That is, the FR 
and FI bits allow the system floating-point enabled exception error 
handler to unround the result, thus allowing the result to be 
denormalized. 



When Underflow Exception is disabled (FPSCR[UE]=0) and underflow 
occurs, then the following actions are taken: 

1. Underflow Exception is set 

FPSCR[UX] ← 1 

2. The rounded result is placed into the target FPR 

3. FPSCR[FPRF] is set to indicate the class and sign of the result 
(± Denormalized Number or ± Zero) 






Programming Note 

In some implementations, enabling Inexact Exceptions may degrade 
performance more than does enabling other types of floating-point 
exception. 



4.4.5 Inexact Exception 



Definition 

Inexact Exception occurs when either of two conditions occurs during 
rounding: 

1. The rounded result differs from the intermediate result, assuming the 
intermediate result exponent range and precision to be unbounded. 

2. The rounded result overflows and Overflow Exception is disabled. 



Action 

The action to be taken does not depend on the setting of the Inexact 
Exception Enable bit of the FPSCR. 

When Inexact Exception occurs, then the following actions are taken: 

1. Inexact Exception is set 

FPSCR[XX] ← 1 

2. The rounded or overflowed result is placed into the target FPR 

3. FPSCR[FPRF] is set to indicate the class and sign of the result 



4.5 Floating-Point Execution Models 

All implementations of this architecture must provide the equivalent of 
the following execution models to ensure that identical results are 
obtained. 

Special rules are provided in the definition of the computational 
instructions for the infinities, denormalized numbers, and NaNs. 

Although the double format specifies an 11-bit exponent, exponent 
arithmetic makes use of two additional bit positions to avoid potential 
transient overflow conditions. One extra bit is required when 
denormalized double-precision numbers are prenormalized. The second bit is 
required to permit the computation of the adjusted exponent value in the 
following cases when the corresponding exception enable bit is 1: 

■ Underflow during multiplication using a denormalized operand 

■ Overflow during division using a denormalized divisor 

The IEEE standard includes 32-bit and 64-bit arithmetic. The standard 
requires that single-precision arithmetic be provided for single-precision 
operands. The standard permits double-precision floating-point 
operations to have either (or both) single-precision or double-precision 
operands, but it states that single-precision floating-point operations should 
not accept double-precision operands. The PowerPC Architecture follows 
these guidelines: double-precision arithmetic instructions can have 
operands of either or both precisions, while single-precision arithmetic 
instructions require all operands to be single-precision. Double-precision 
arithmetic instructions and fcfid produce double-precision values, while 
single-precision arithmetic instructions produce single-precision values. 

For arithmetic instructions, conversions from double-precision to sin- 
gle-precision must be done explicitly by software, while conversions from 
single-precision to double-precision are done implicitly. 

4.5.1 Execution Model for IEEE Operations 

The following description uses 64-bit arithmetic as an example. 32-bit 
arithmetic is similar except that the FRACTION is a 23-bit field, and the 
single-precision Guard, Round, and Sticky bits (described in this section) 
are logically adjacent to the 23-bit FRACTION field. 

IEEE-conforming significand arithmetic is considered to be performed 
with a floating-point accumulator having the following format: 



| S | C | L |            FRACTION            | G | R | X | 
          0   1                            52 

Figure 31. IEEE 64-bit execution model 



The S bit is the sign bit. 

The C bit is the carry bit that captures the carry out of the significand. 

The L bit is the leading unit bit of the significand that receives the 
implicit bit from the operand. 

The FRACTION is a 52-bit field that accepts the fraction of the oper- 
and. 

The Guard (G), Round (R), and Sticky (X) bits are extensions to the 
low-order bits of the accumulator. The G and R bits are required for 
postnormalization of the result. The G, R, and X bits are required during 
rounding to determine if the intermediate result is equally near the two 
nearest representable values. The X bit serves as an extension to the G 
and R bits by representing the logical OR of all bits that may appear to 
the low-order side of the R bit, due either to shifting the accumulator 
right or to other generation of low-order result bits. The G and R bits 
participate in the left shifts with zeros being shifted into the R bit. 
Figure 32 shows the significance of the G, R, and X bits with respect to 
the intermediate result (IR), the representable number next lower in 
magnitude (NL), and the representable number next higher in magnitude 
(NH). 



G   R   X   Interpretation 
0   0   0   IR is exact 
0   0   1   IR closer to NL 
0   1   0   IR closer to NL 
0   1   1   IR closer to NL 
1   0   0   IR midway between NL and NH 
1   0   1   IR closer to NH 
1   1   0   IR closer to NH 
1   1   1   IR closer to NH 

Figure 32. Interpretation of G, R, and X bits 
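The table can be exercised with a short Python sketch that derives G, R, and X from the bits to be discarded and then classifies the intermediate result. The helper names `split_grx` and `interpret` are hypothetical, and the significand is modeled as a plain integer for illustration.

```python
def split_grx(bits, extra):
    """Split an integer significand into (retained, G, R, X).

    `extra` is the number of low-order bits that will not be retained.
    G is the first discarded bit, R the second, and X is the logical OR
    of every discarded bit below R.
    """
    retained = bits >> extra
    low = bits & ((1 << extra) - 1)
    g = (low >> (extra - 1)) & 1 if extra >= 1 else 0
    r = (low >> (extra - 2)) & 1 if extra >= 2 else 0
    sticky_mask = (1 << (extra - 2)) - 1 if extra > 2 else 0
    x = 1 if (low & sticky_mask) else 0
    return retained, g, r, x


def interpret(g, r, x):
    """Return the Figure 32 interpretation for a G, R, X combination."""
    if (g, r, x) == (0, 0, 0):
        return "IR is exact"
    if g == 0:
        return "IR closer to NL"
    if (r, x) == (0, 0):
        return "IR midway between NL and NH"
    return "IR closer to NH"
```

For instance, discarding the five low-order bits `10001` yields G=1, R=0, X=1, which the table classifies as closer to NH.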



The significand of the intermediate result is made up of the L bit, the 
FRACTION, and the G, R, and X bits. 

The infinitely precise intermediate result of an operation is the result 
normalized in bits L, FRACTION, G, R, and X of the floating-point 
accumulator. 

Before the result is stored into an FPR, the significand is rounded if 
necessary, using the rounding mode specified by FPSCR[RN]. If rounding 
results in a carry into C, the significand is shifted right one position and 
the exponent incremented by one. This action yields an inexact result and 
possibly also exponent overflow. Fraction bits to the left of the bit 
position used for rounding are stored into the FPR and low-order bit 
positions, if any, are set to zero. 

Four user-selectable rounding modes are provided through FPSCR[RN] 
as described in Section 4.3.6, "Rounding," on page 149. For rounding, the 
conceptual Guard, Round, and Sticky bits are defined in terms of 
accumulator bits. Figure 33 on page 165 shows the positions of the Guard, 
Round, and Sticky bits for double-precision and single-precision 
floating-point numbers. 

Format   Guard   Round   Sticky 
Double   G bit   R bit   X bit 
Single   24      25      26:52, G, R, X 

Figure 33. Location of the Guard, Round, and Sticky bits 



Rounding can be treated as though the significand were shifted right, if 
required, until the least significant bit to be retained is in the low-order 
bit position of the FRACTION. If any of the Guard, Round, or Sticky 
bits is nonzero, then the result is inexact. 

Z1 and Z2, as defined on page 149, can be used to approximate the 
result in the target format when one of the following rules is used. 

■ Round to Nearest 

Guard bit = 0 

The result is truncated. (Result exact (GRX = 000) or closest to next 
lower value in magnitude (GRX = 001, 010, or 011)) 

Guard bit = 1 

Depends on Round and Sticky bits: 

Case a 

If the Round or Sticky bit is 1 (inclusive), the result is incremented. 
(Result closest to next higher value in magnitude (GRX = 101, 110, 
or 111)) 

Case b 

If the Round and Sticky bits are 0 (result midway between closest 
representable values), then if the low-order bit of the result is 1 the 
result is incremented. Otherwise (the low-order bit of the result is 
0) the result is truncated (this is the case of a tie rounded to even). 

If during the Round to Nearest process, truncation of the unrounded 
number would produce the maximum magnitude for the specified 
precision, then the following action is taken: 

Guard bit = 0 

Store the truncated (maximum magnitude) value. 

Guard bit = 1 

Store infinity with the sign of the unrounded result. 

■ Round toward Zero 

Choose the smaller in magnitude of Z1 or Z2. If the Guard, Round, or 
Sticky bit is nonzero, the result is inexact. 

■ Round toward +Infinity 

Choose Z1. 

■ Round toward -Infinity 

Choose Z2. 

Where the result is to have fewer than 53 bits of precision because the 
instruction is a Floating Round to Single-Precision or single-precision 
arithmetic instruction, the intermediate result either is normalized or is 
placed in correct denormalized form before any rounding is done. 
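The four rounding rules can be sketched as one function operating on a truncated significand magnitude and its G, R, and X bits. This Python sketch is illustrative only: the mode names and `round_significand` are assumptions for this example, and handling of a carry out of the significand (shift right, increment exponent) is left to the caller.

```python
def round_significand(sign, sig, g, r, x, mode):
    """Round a truncated significand magnitude.

    sign: 0 for positive, 1 for negative.
    sig:  integer holding the retained (truncated) significand bits.
    mode: "nearest", "zero", "+inf", or "-inf" (hypothetical names).
    Returns (rounded_sig, inexact).
    """
    inexact = bool(g or r or x)
    if mode == "nearest":
        # Increment when above the midpoint, or at the midpoint with an
        # odd low-order bit (tie rounded to even).
        if g == 1 and (r or x or (sig & 1)):
            sig += 1
    elif mode == "zero":
        pass  # truncation is already the smaller magnitude
    elif mode == "+inf":
        if inexact and sign == 0:
            sig += 1
    elif mode == "-inf":
        if inexact and sign == 1:
            sig += 1
    else:
        raise ValueError("unknown rounding mode: %r" % (mode,))
    return sig, inexact
```

Incrementing the magnitude of a negative value moves it toward -Infinity, which is why the two directed modes differ only in the sign test.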

4.5.2 Execution Model for Multiply-Add Type 
Instructions 

The PowerPC Architecture makes use of a special form of instruction that 
performs up to three operations in one instruction (a multiplication, an 
addition, and a negation). With this added capability comes the special 
ability to produce a more exact intermediate result as an input to the 
rounder. 32-bit arithmetic is similar except that the FRACTION field is 
smaller. 

The multiply-add operations produce intermediate results conforming 
to the following model: 



| S | C | L |            FRACTION            | X' | 
          0   1                           105 

Figure 34. Multiply-Add execution model 



The first part of the operation is a multiplication. The multiplication 
has two 53-bit significands as inputs, which are assumed to be 
prenormalized, and produces a result conforming to the above model. If there is 
a carry out of the significand (into the C bit), then the significand is 
shifted right one position, shifting the L bit (leading unit bit) into the 
most significant bit of the FRACTION and shifting the C bit (carry out) 
into the L bit. All 106 bits (L bit, the FRACTION) of the product take 
part in the add operation. If the exponents of the two inputs to the adder 
are not equal, the significand of the operand with the smaller exponent is 
aligned (shifted) to the right by an amount that is added to that exponent 
to make it equal to the other input's exponent. Zeros are shifted into the 
left of the significand as it is aligned and bits shifted out of bit 105 of the 
significand are ORed into the X' bit. The add operation also produces a 
result conforming to the above model with the X' bit taking part in the 
add operation. 

The result of the addition is then normalized, with all bits of the addi- 
tion result, except the X' bit, participating in the shift. The normalized 
result provides an intermediate result as input to the rounder that con- 
forms to the model described in Section 4.5.1, "Execution Model for 
IEEE Operations," on page 163, where: 

■ The Guard bit is bit 53 of the intermediate result. 

■ The Round bit is bit 54 of the intermediate result. 

■ The Sticky bit is the OR of all remaining bits to the right of bit 55, 
inclusive. 

The rules of rounding the intermediate result are the same as those 
given in Section 4.5.1, "Execution Model for IEEE Operations," on 
page 163. 

If the instruction is Floating Negative Multiply-Add or Floating 
Negative Multiply-Subtract, the final result is negated. 

Status bits are set as follows: 

■ Overflow, Underflow, and Inexact Exception bits, the FR and FI bits, 
and the FPRF field are set based on the final result of the operation, 
and not on the result of the multiplication. 

■ Invalid Operation Exception bits are set as if the multiplication and 
the addition were performed using two separate instructions (fmul[s], 
followed by fadd[s] or fsub[s]). That is, multiplication of infinity by 0 
or of anything by an SNaN, and/or addition of an SNaN, cause the 
corresponding exception bits to be set. 
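The effect of rounding only once can be seen by comparing a fused computation with a separately rounded one. The sketch below is an illustration, not the architecture's algorithm: it obtains the single rounding by computing the exact rational result with Python's `fractions.Fraction` and converting it to a double, which rounds to nearest. The function names are hypothetical, and results that would overflow raise `OverflowError` in this model instead of producing infinity.

```python
from fractions import Fraction


def fused_multiply_add(a, c, b):
    """a*c + b with one rounding: exact product and sum, then round."""
    exact = Fraction(a) * Fraction(c) + Fraction(b)
    return float(exact)


def separate_multiply_add(a, c, b):
    """a*c + b with two roundings, as with fmul followed by fadd."""
    return a * c + b
```

With a = 1 + 2^-30, c = 1 - 2^-30, and b = -1, the exact product is 1 - 2^-60. Rounding the product first gives 1.0 and a final sum of zero, while the fused form keeps the term -2^-60.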



4.6 Floating-Point Processor Instructions 



4.6.1 Floating-Point Storage Access Instructions 

The Storage Access instructions compute the effective address (EA) of the 
storage to be accessed as described in Section 1.11.2, "Effective Address 
Calculation," on page 29. 

Programming Note 

The "la" extended mnemonic permits computing an effective address as a 
Load or Store instruction would, but loads the address itself into a GPR 
rather than loading the value that is in storage at that address. This 
extended mnemonic is described in "Load Address," on page 232. 



The order of bytes accessed by floating-point loads and stores is Big- 
Endian, unless Little-Endian storage ordering is selected as described in 
Appendix D, "Little-Endian Byte Ordering," on page 233. 



Storage Access Exceptions 

Storage accesses will cause the system error handler to be invoked if the 
program is not allowed to modify the target storage (Store only), or if the 
program attempts to access storage that is unavailable. 



4.6.2 Floating-Point Load Instructions 

There are two basic forms of load instruction: single-precision and dou- 
ble-precision. Because the FPRs support only floating-point double for- 
mat, single-precision Load Floating-Point instructions convert single- 
precision data to double format prior to loading the operands into the 
target FPR. The conversion and loading steps are as follows: 

Let WORD[0:31] be the floating-point single-precision operand accessed 
from storage. 

Normalized Operand 

if WORD[1:8] > 0 and WORD[1:8] < 255 then 
    FRT[0:1] ← WORD[0:1] 
    FRT[2] ← ¬WORD[1] 
    FRT[3] ← ¬WORD[1] 
    FRT[4] ← ¬WORD[1] 
    FRT[5:63] ← WORD[2:31] || ²⁹0 

Denormalized Operand 

if WORD[1:8] = 0 and WORD[9:31] ≠ 0 then 
    sign ← WORD[0] 
    exp ← -126 
    frac[0:52] ← 0b0 || WORD[9:31] || ²⁹0 
    normalize the operand 
        do while frac[0] = 0 
            frac ← frac[1:52] || 0b0 
            exp ← exp - 1 
    FRT[0] ← sign 
    FRT[1:11] ← exp + 1023 
    FRT[12:63] ← frac[1:52] 

Zero / Infinity / NaN 

if WORD[1:8] = 255 or WORD[1:31] = 0 then 
    FRT[0:1] ← WORD[0:1] 
    FRT[2] ← WORD[1] 
    FRT[3] ← WORD[1] 
    FRT[4] ← WORD[1] 
    FRT[5:63] ← WORD[2:31] || ²⁹0 

For double-precision Load Floating-Point instructions, no conversion 
is required as the data from storage are copied directly into the FPR. 
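The single-to-double conversion steps can be modeled on integer bit images. The following Python sketch is an illustration under stated assumptions: the function name `double_from_single_word` is hypothetical, bit 0 of the text corresponds to the most significant bit of the integer, and the normalized case uses the arithmetic rebias (exponent + 896), which produces the same bits as the bit-copy description.

```python
import struct


def double_from_single_word(word):
    """Convert a 32-bit single-format image to a 64-bit double-format image."""
    sign = (word >> 31) & 1
    exp = (word >> 23) & 0xFF          # WORD[1:8]
    frac = word & 0x7FFFFF             # WORD[9:31]

    if 0 < exp < 255:
        # Normalized operand: rebias 127 -> 1023, pad fraction with 29 zeros.
        return (sign << 63) | ((exp + 896) << 52) | (frac << 29)

    if exp == 0 and frac != 0:
        # Denormalized operand: normalize a 53-bit value whose leading
        # (implicit-bit) position starts out as zero.
        e = -126
        f = frac << 29
        while (f >> 52) & 1 == 0:
            f = (f << 1) & ((1 << 53) - 1)
            e -= 1
        return (sign << 63) | ((e + 1023) << 52) | (f & ((1 << 52) - 1))

    # Zero, Infinity, or NaN: the exponent is all zeros or all ones.
    exp11 = 0x7FF if exp == 255 else 0
    return (sign << 63) | (exp11 << 52) | (frac << 29)


def load_single(value):
    """Helper for experiments: round-trip a Python float via single format."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    (result,) = struct.unpack(">d", struct.pack(">Q", double_from_single_word(word)))
    return result
```

Because every single-precision value is exactly representable in double format, the conversion never rounds.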

Many of the Load Floating-Point instructions have an "update" form, 
in which register RA is updated with the effective address. For these 
forms, if RA≠0, the effective address is placed into register RA and the 
storage element (word or doubleword) addressed by EA is loaded into 
FRT. 

Note: Recall that RA and RB denote general purpose registers, while 
FRT denotes a floating-point register. 

Byte order of PowerPC is Big-Endian by default; see Appendix D, "Lit- 
tle-Endian Byte Ordering," on page 233 for PowerPC systems operated 
with Little-Endian byte ordering. 



Load Floating-Point Single D-form 

lfs FRT,D(RA) 

| 48    | FRT   | RA    | D              | 
0       6       11      16             31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + EXTS(D) 
FRT ← DOUBLE(MEM(EA, 4)) 

Let the effective address (EA) be the sum (RA|0)+D. 

The word in storage addressed by EA is interpreted as a floating-point 
single-precision operand. This word is converted to floating-point double 
format (see page 168) and placed into register FRT. 
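The effective address calculation and the D-form field layout can be sketched in Python. This is an illustration with hypothetical helper names (`exts16`, `effective_address_d_form`, `encode_d_form`); the 32-bit address width is an assumption chosen for the example, and bit 0 of the instruction is taken to be the most significant bit of the 32-bit word.

```python
def exts16(d):
    """Sign-extend the 16-bit D field (EXTS in the descriptions)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address_d_form(ra, gpr, d, bits=32):
    """EA = (RA|0) + EXTS(D), truncated to the address width."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + exts16(d)) & ((1 << bits) - 1)


def encode_d_form(opcode, frt, ra, d):
    """Pack opcode (bits 0:5), FRT (6:10), RA (11:15), and D (16:31)."""
    return (opcode << 26) | (frt << 21) | (ra << 16) | (d & 0xFFFF)


# Usage: lfs with FRT=1, D=8, RA=3, using primary opcode 48.
gpr = [0] * 32
gpr[3] = 0x1000
word = encode_d_form(48, 1, 3, 8)
ea = effective_address_d_form(3, gpr, 8)
```

When RA is 0 the base is the value zero, not the contents of GPR 0, so a negative displacement wraps to the top of the address space.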

Special Registers Altered 

None 

Load Floating-Point Single Indexed X-form 

lfsx FRT,RA,RB 

| 31    | FRT   | RA    | RB    | 535        | /  | 
0       6       11      16      21           31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + (RB) 
FRT ← DOUBLE(MEM(EA, 4)) 

Let the effective address (EA) be the sum (RA|0)+(RB). 

The word in storage addressed by EA is interpreted as a floating-point 
single-precision operand. This word is converted to floating-point double 
format (see page 168) and placed into register FRT. 

Special Registers Altered 

None 
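The X-form layout used by the indexed loads and stores can be sketched as a pair of pack and unpack helpers. The helper names are hypothetical; the extended opcodes in the table are the values given in the instruction descriptions of this section, and the reserved bit 31 is written as zero.

```python
# Extended opcodes (bits 21:30) of the indexed floating-point loads and stores.
X_FORM_XO = {
    "lfsx": 535, "lfsux": 567, "lfdx": 599, "lfdux": 631,
    "stfsx": 663, "stfsux": 695, "stfdx": 727, "stfdux": 759,
}


def encode_x_form(mnemonic, fr, ra, rb):
    """Pack primary opcode 31, FRT/FRS, RA, RB, and the extended opcode."""
    xo = X_FORM_XO[mnemonic]
    return (31 << 26) | (fr << 21) | (ra << 16) | (rb << 11) | (xo << 1)


def decode_x_form(word):
    """Unpack the X-form fields from a 32-bit instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "fr": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,
    }
```

For example, `encode_x_form("lfsx", 1, 3, 4)` packs FRT=1, RA=3, RB=4, and decoding the result recovers the same fields.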



Load Floating-Point Single with Update D-form 

lfsu FRT,D(RA) 

| 49    | FRT   | RA    | D              | 
0       6       11      16             31 

EA ← (RA) + EXTS(D) 
FRT ← DOUBLE(MEM(EA, 4)) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+D. 

The word in storage addressed by EA is interpreted as a floating-point 
single-precision operand. This word is converted to floating-point double 
format (see page 168) and placed into register FRT. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 
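The update form is convenient for stepping through an array, because each load also advances the base register. The following Python sketch models that behavior under stated assumptions: storage is a `bytearray` in Big-Endian order, registers are plain lists, the function name `lfsu` mirrors the mnemonic but is only a model, and Python's `struct` conversion stands in for DOUBLE().

```python
import struct


def lfsu(fpr, gpr, memory, frt, d, ra):
    """Model of Load Floating-Point Single with Update."""
    if ra == 0:
        raise ValueError("RA=0: the instruction form is invalid")
    d &= 0xFFFF
    offset = d - 0x10000 if d & 0x8000 else d      # EXTS(D)
    ea = gpr[ra] + offset
    # Read a Big-Endian single; Python floats are doubles, so the
    # widening to double format is implicit.
    (value,) = struct.unpack_from(">f", memory, ea)
    fpr[frt] = value
    gpr[ra] = ea                                    # RA <- EA


# Usage: walk two single-precision values stored at addresses 4 and 8.
memory = bytearray(4) + struct.pack(">ff", 1.0, 2.5)
gpr = [0] * 32
fpr = [0.0] * 32
lfsu(fpr, gpr, memory, frt=1, d=4, ra=3)
first = (fpr[1], gpr[3])
lfsu(fpr, gpr, memory, frt=1, d=4, ra=3)
second = (fpr[1], gpr[3])
```

Each call leaves RA pointing at the element just loaded, so repeating the same instruction reads consecutive elements.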

Special Registers Altered 

None 

Load Floating-Point Single with Update Indexed X-form 

lfsux FRT,RA,RB 

| 31    | FRT   | RA    | RB    | 567        | /  | 
0       6       11      16      21           31 

EA ← (RA) + (RB) 
FRT ← DOUBLE(MEM(EA, 4)) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+(RB). 

The word in storage addressed by EA is interpreted as a floating-point 
single-precision operand. This word is converted to floating-point double 
format (see page 168) and placed into register FRT. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Load Floating-Point Double D-form 

lfd FRT,D(RA) 

| 50    | FRT   | RA    | D              | 
0       6       11      16             31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + EXTS(D) 
FRT ← MEM(EA, 8) 

Let the effective address (EA) be the sum (RA|0)+D. 

The doubleword in storage addressed by EA is placed into register 
FRT. 

Special Registers Altered 

None 

Load Floating-Point Double Indexed X-form 

lfdx FRT,RA,RB 

| 31    | FRT   | RA    | RB    | 599        | /  | 
0       6       11      16      21           31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + (RB) 
FRT ← MEM(EA, 8) 

Let the effective address (EA) be the sum (RA|0)+(RB). 
The doubleword in storage addressed by EA is placed into register 
FRT. 

Special Registers Altered 

None 



Load Floating-Point Double with Update D-form 

lfdu FRT,D(RA) 

| 51    | FRT   | RA    | D              | 
0       6       11      16             31 

EA ← (RA) + EXTS(D) 
FRT ← MEM(EA, 8) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+D. 

The doubleword in storage addressed by EA is placed into register 
FRT. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Load Floating-Point Double with Update Indexed X-form 

lfdux FRT,RA,RB 

| 31    | FRT   | RA    | RB    | 631        | /  | 
0       6       11      16      21           31 

EA ← (RA) + (RB) 
FRT ← MEM(EA, 8) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+(RB). 
The doubleword in storage addressed by EA is placed into register 
FRT. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 

4.6.3 Floating-Point Store Instructions 

There are three basic forms of store instruction: single-precision, double- 
precision, and integer. The integer form is provided by the optional Store 
Floating-Point as Integer Word instruction, described on page 198. 
Because the FPRs support only floating-point double format for floating- 
point data, single-precision Store Floating-Point instructions convert 
double-precision data to single format prior to storing the operands into 
storage. The conversion steps are as follows: 

Let WORD[0:31] be the word in storage written to. 

No Denormalization Required (includes Zero / Infinity / NaN) 

if FRS[1:11] > 896 or FRS[1:63] = 0 then 
    WORD[0:1] ← FRS[0:1] 
    WORD[2:31] ← FRS[5:34] 

Denormalization Required 

if 874 ≤ FRS[1:11] ≤ 896 then 
    sign ← FRS[0] 
    exp ← FRS[1:11] - 1023 
    frac ← 0b1 || FRS[12:63] 
    denormalize the operand 
        do while exp < -126 
            frac ← 0b0 || frac[0:62] 
            exp ← exp + 1 
    WORD[0] ← sign 
    WORD[1:8] ← 0x00 
    WORD[9:31] ← frac[1:23] 
else WORD ← undefined 

Notice that if the value to be stored by a single-precision Store 
Floating-Point instruction is larger in magnitude than the maximum number 
representable in single format, the first case above (No Denormalization 
Required) applies. The result stored in WORD is then a well-defined 
value, but it is not numerically equal to the value in the source register 
(i.e., the result of a single-precision Load Floating-Point from WORD 
will not compare equal to the contents of the original source register). 

For double-precision Store Floating-Point instructions and for the 
Store Floating-Point as Integer Word instruction, no conversion is 
required, as the data from the FPR are copied directly into storage. 
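The double-to-single store conversion can be modeled on integer bit images in the same way. This Python sketch is an illustration with hypothetical names; bit 0 of the text is the most significant bit of the integer, the denormalization case reads the exponent bounds as inclusive (874 through 896), and the undefined case is reported by raising an exception, which is a choice of this model only.

```python
import struct


def single_word_from_double(frs):
    """Convert a 64-bit double-format image to the 32-bit word stored."""
    sign = (frs >> 63) & 1
    exp = (frs >> 52) & 0x7FF                  # FRS[1:11]
    frac = frs & ((1 << 52) - 1)               # FRS[12:63]

    if exp > 896 or (frs & ((1 << 63) - 1)) == 0:
        # No denormalization: WORD[0:1] <- FRS[0:1], WORD[2:31] <- FRS[5:34].
        top2 = (frs >> 62) & 0x3
        low30 = (frs >> 29) & 0x3FFFFFFF
        return (top2 << 30) | low30

    if 874 <= exp <= 896:
        # Denormalization: shift the significand (with its leading 1)
        # right until the exponent reaches -126; shifted-out bits are lost.
        e = exp - 1023
        f = (1 << 52) | frac
        while e < -126:
            f >>= 1
            e += 1
        return (sign << 31) | ((f >> 29) & 0x7FFFFF)   # frac[1:23]

    raise ValueError("WORD is undefined for this operand")


def store_single(value):
    """Helper for experiments: return the stored word for a Python float."""
    (image,) = struct.unpack(">Q", struct.pack(">d", value))
    return single_word_from_double(image)
```

The conversion copies or shifts bits and does not round, so it returns the expected single-precision image only when the register already holds a value representable in single format.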

Many of the Store Floating-Point instructions have an "update" form, 
in which register RA is updated with the effective address. For these 
forms, if RA≠0, the effective address is placed into register RA. 

Note: Recall that RA and RB denote general purpose registers, while 
FRS denotes a floating-point register. 

Byte order of PowerPC is Big-Endian by default; see Appendix D, "Lit- 
tle-Endian Byte Ordering," on page 233 for PowerPC systems operated 
with Little-Endian byte ordering. 



Store Floating-Point Single D-form 

stfs FRS,D(RA) 

| 52    | FRS   | RA    | D              | 
0       6       11      16             31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + EXTS(D) 
MEM(EA, 4) ← SINGLE(FRS) 

Let the effective address (EA) be the sum (RA|0)+D. 
The contents of register FRS are converted to single format (see 
page 173) and stored into the word in storage addressed by EA. 

Special Registers Altered 

None 

Store Floating-Point Single Indexed X-form 

stfsx FRS,RA,RB 

| 31    | FRS   | RA    | RB    | 663        | /  | 
0       6       11      16      21           31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + (RB) 
MEM(EA, 4) ← SINGLE(FRS) 

Let the effective address (EA) be the sum (RA|0)+(RB). 
The contents of register FRS are converted to single format (see 
page 173) and stored into the word in storage addressed by EA. 

Special Registers Altered 

None 



Store Floating-Point Single with Update D-form 

stfsu FRS,D(RA) 

| 53    | FRS   | RA    | D              | 
0       6       11      16             31 

EA ← (RA) + EXTS(D) 
MEM(EA, 4) ← SINGLE(FRS) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+D. 
The contents of register FRS are converted to single format (see 
page 173) and stored into the word in storage addressed by EA. 
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Floating-Point Single with Update Indexed X-form 

stfsux FRS,RA,RB 

| 31    | FRS   | RA    | RB    | 695        | /  | 
0       6       11      16      21           31 

EA ← (RA) + (RB) 
MEM(EA, 4) ← SINGLE(FRS) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+(RB). 
The contents of register FRS are converted to single format (see 
page 173) and stored into the word in storage addressed by EA. 
EA is placed into register RA. 
If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Floating-Point Double D-form 

stfd FRS,D(RA) 

| 54    | FRS   | RA    | D              | 
0       6       11      16             31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + EXTS(D) 
MEM(EA, 8) ← (FRS) 

Let the effective address (EA) be the sum (RA|0)+D. 
The contents of register FRS are stored into the doubleword in storage 
addressed by EA. 

Special Registers Altered 

None 



Store Floating-Point Double Indexed X-form 

stfdx FRS,RA,RB 

| 31    | FRS   | RA    | RB    | 727        | /  | 
0       6       11      16      21           31 

if RA = 0 then b ← 0 
else b ← (RA) 
EA ← b + (RB) 
MEM(EA, 8) ← (FRS) 

Let the effective address (EA) be the sum (RA|0)+(RB). 

The contents of register FRS are stored into the doubleword in storage 
addressed by EA. 

Special Registers Altered 

None 



Store Floating-Point Double with Update D-form 

stfdu FRS,D(RA) 

| 55    | FRS   | RA    | D              | 
0       6       11      16             31 

EA ← (RA) + EXTS(D) 
MEM(EA, 8) ← (FRS) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+D. 
The contents of register FRS are stored into the doubleword in storage 
addressed by EA. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 



Store Floating-Point Double with Update Indexed X-form 

stfdux FRS,RA,RB 

| 31    | FRS   | RA    | RB    | 759        | /  | 
0       6       11      16      21           31 

EA ← (RA) + (RB) 
MEM(EA, 8) ← (FRS) 
RA ← EA 

Let the effective address (EA) be the sum (RA)+(RB). 
The contents of register FRS are stored into the doubleword in storage 
addressed by EA. 

EA is placed into register RA. 

If RA=0, the instruction form is invalid. 

Special Registers Altered 

None 

4.6.4 Floating-Point Move Instructions 

These instructions copy data from one floating-point register to another, 
altering the sign bit (bit 0) as described below for fneg, fabs, and fnabs. 
These instructions treat NaNs just like any other kind of value (e.g., the 
sign bit of a NaN may be altered by fneg, fabs, and fnabs). These 
instructions do not alter the FPSCR. 



Floating Move Register X-form 

fmr     FRT,FRB   (Rc=0) 
fmr.    FRT,FRB   (Rc=1) 

| 63    | FRT   | ///   | FRB   | 72         | Rc | 
0       6       11      16      21           31 

The contents of register FRB are placed into register FRT. 

Special Registers Altered 

CR1   (if Rc=1) 



Floating Negate X-form 

fneg    FRT,FRB   (Rc=0) 
fneg.   FRT,FRB   (Rc=1) 

| 63    | FRT   | ///   | FRB   | 40         | Rc | 
0       6       11      16      21           31 

The contents of register FRB with bit 0 inverted are placed into 
register FRT. 

Special Registers Altered 

CR1   (if Rc=1) 



Floating Absolute Value X-form 

fabs    FRT,FRB   (Rc=0) 
fabs.   FRT,FRB   (Rc=1) 

| 63    | FRT   | ///   | FRB   | 264        | Rc | 
0       6       11      16      21           31 

The contents of register FRB with bit 0 set to zero are placed into 
register FRT. 

Special Registers Altered 

CR1   (if Rc=1) 

Floating Negative Absolute Value X-form 

fnabs   FRT,FRB   (Rc=0) 
fnabs.  FRT,FRB   (Rc=1) 

| 63    | FRT   | ///   | FRB   | 136        | Rc | 
0       6       11      16      21           31 

The contents of register FRB with bit 0 set to one are placed into 
register FRT. 

Special Registers Altered 

CR1   (if Rc=1) 
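Because the move instructions only copy bits and adjust the sign, they can be modeled as integer operations on the 64-bit register image. The function names below mirror the mnemonics but are only a Python illustration (`fabs_` carries a trailing underscore purely to avoid clashing with `math.fabs`); bit 0 of the text is the most significant bit of the integer.

```python
SIGN = 1 << 63
MASK64 = (1 << 64) - 1


def fmr(frb):
    """Copy the register image unchanged."""
    return frb & MASK64


def fneg(frb):
    """Copy with bit 0 (the sign) inverted."""
    return (frb ^ SIGN) & MASK64


def fabs_(frb):
    """Copy with bit 0 set to zero."""
    return frb & ~SIGN & MASK64


def fnabs(frb):
    """Copy with bit 0 set to one."""
    return (frb | SIGN) & MASK64
```

Applied to the NaN image 0x7FF8000000000000, `fneg` returns 0xFFF8000000000000: the sign of a NaN is altered like that of any other value, and no arithmetic is involved.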



4.6.5 Floating-Point Arithmetic Instructions 

Floating-Point Elementary Arithmetic Instructions 

Floating Add [Single] A-form 

fadd    FRT,FRA,FRB   (Rc=0) 
fadd.   FRT,FRA,FRB   (Rc=1) 

[POWER mnemonics: fa, fa.] 

| 63    | FRT   | FRA   | FRB   | ///   | 21    | Rc | 
0       6       11      16      21      26      31 

fadds   FRT,FRA,FRB   (Rc=0) 
fadds.  FRT,FRA,FRB   (Rc=1) 

| 59    | FRT   | FRA   | FRB   | ///   | 21    | Rc | 
0       6       11      16      21      26      31 

The floating-point operand in register FRA is added to the 
floating-point operand in register FRB. 

If the most significant bit of the resultant significand is not 1, the result 
is normalized. The result is rounded to the target precision under control 
of the Floating-Point Rounding Control field RN of the FPSCR and 
placed into register FRT. 






Floating-point addition is based on exponent comparison and addition 
of the two significands. The exponents of the two operands are com- 
pared, and the significand accompanying the smaller exponent is shifted 
right, with its exponent increased by one for each bit shifted, until the 
two exponents are equal. The two significands are then added or sub- 
tracted as appropriate, depending on the signs of the operands, to form 
an intermediate sum. All 53 bits in the significand as well as all three 
guard bits (G, R, and X) enter into the computation. 

If a carry occurs, the sum's significand is shifted right one bit position 
and the exponent is increased by one. 
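The alignment and carry steps can be sketched for the simple case of two positive normalized operands. This Python sketch is an illustration, not the architecture's definition: operands are given as a hypothetical (exponent, 53-bit integer significand) pair, three low-order positions are appended for G, R, and X, and bits shifted past the X position are ORed into X.

```python
def add_magnitudes(exp_a, sig_a, exp_b, sig_b):
    """Add two positive normalized operands.

    Each significand is a 53-bit integer (leading unit bit included).
    Returns (exponent, significand, G, R, X) of the unrounded sum.
    """
    # Make operand A the one with the larger exponent.
    if exp_a < exp_b:
        exp_a, sig_a, exp_b, sig_b = exp_b, sig_b, exp_a, sig_a
    shift = exp_a - exp_b

    # Append the G, R, and X positions below each significand.
    wide_a = sig_a << 3
    wide_b = sig_b << 3

    # Align the smaller operand; anything shifted out becomes sticky.
    lost = wide_b & ((1 << shift) - 1)
    wide_b = (wide_b >> shift) | (1 if lost else 0)

    total = wide_a + wide_b
    exp = exp_a
    if total >> 56:
        # Carry out of the significand: shift right one position and
        # increase the exponent, keeping the shifted-out bit sticky.
        sticky = total & 1
        total = (total >> 1) | sticky
        exp += 1
    return exp, total >> 3, (total >> 2) & 1, (total >> 1) & 1, total & 1
```

Adding 1.0 and 2^-53 leaves the significand of 1.0 unchanged with G=1, R=0, X=0, which is the midway case that Round to Nearest resolves toward the even value.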

FPSCR[FPRF] is set to the class and sign of the result, except for Invalid 
Operation Exceptions when FPSCR[VE]=1. 

Special Registers Altered 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI 
CR1   (if Rc=1) 

Floating Subtract [Single] A-form 

fsub    FRT,FRA,FRB   (Rc=0) 
fsub.   FRT,FRA,FRB   (Rc=1) 

[POWER mnemonics: fs, fs.] 

| 63    | FRT   | FRA   | FRB   | ///   | 20    | Rc | 
0       6       11      16      21      26      31 

fsubs   FRT,FRA,FRB   (Rc=0) 
fsubs.  FRT,FRA,FRB   (Rc=1) 

| 59    | FRT   | FRA   | FRB   | ///   | 20    | Rc | 
0       6       11      16      21      26      31 

The floating-point operand in register FRB is subtracted from the 
floating-point operand in register FRA. 

If the most significant bit of the resultant significand is not 1, the result 
is normalized. The result is rounded to the target precision under control 
of the Floating-Point Rounding Control field RN of the FPSCR and 
placed into register FRT. 

The execution of the Floating Subtract instruction is identical to that 
of Floating Add, except that the contents of FRB participate in the 
operation with the sign bit (bit 0) inverted. 

FPSCR[FPRF] is set to the class and sign of the result, except for Invalid 
Operation Exceptions when FPSCR[VE]=1. 

Special Registers Altered 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI 
CR1   (if Rc=1) 

Floating Multiply [Single] A-form 

fmul FRT,FRA,FRC 
fmul. FRT,FRA,FRC 

[POWER mnemonics: fm, fm.] 



63 


FRT 


FRA 


III 


FRC 


25 


Rc 


0 


6 


11 


16 


21 
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31 


fmuls 


FRT,FRA,FRC 






(Rc=0) 


fmuls. 


FRT,FRA,FRC 






(Rc=l) 


59 


FRT 


FRA 


III 


FRC 


25 


Rc 


0 


6 


11 


16 


21 


26 


31 



The floating-point operand in register FRA is multiplied by the float- 
ing-point operand in register FRC. 

If the most significant bit of the resultant significand is not 1, the result 
is normalized. The result is rounded to the target precision under control 
of the Floating-Point Rounding Control field RN of the FPSCR and 
placed into register FRT. 

Floating-point multiplication is based on exponent addition and multi- 
plication of the significands. 

FPSCRppRp is set to the class and sign of the result, except for Invalid 
Operation Exceptions when FPSCRy£=1. 

Special Registers Altered 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXIMZ 

CR1 (if Rc=1)

Floating Divide [Single] A-form

fdiv    FRT,FRA,FRB    (Rc=0)
fdiv.   FRT,FRA,FRB    (Rc=1)

[POWER mnemonics: fd, fd.]

Field:      63   FRT   FRA   FRB   ///   18   Rc
Start bit:  0    6     11    16    21    26   31

fdivs   FRT,FRA,FRB    (Rc=0)
fdivs.  FRT,FRA,FRB    (Rc=1)

Field:      59   FRT   FRA   FRB   ///   18   Rc
Start bit:  0    6     11    16    21    26   31


The floating-point operand in register FRA is divided by the floating-
point operand in register FRB. The remainder is not supplied as a result.

If the most significant bit of the resultant significand is not 1, the result
is normalized. The result is rounded to the target precision under control
of the Floating-Point Rounding Control field RN of the FPSCR and
placed into register FRT.

Floating-point division is based on exponent subtraction and division
of the significands.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1 and Zero Divide Exceptions
when FPSCR_ZE=1.

Special Registers Altered 

FPRF FR FI 

FX OX UX ZX XX 

VXSNAN VXIDI VXZDZ 

CR1 (if Rc=1)

Floating-Point Multiply-Add instructions 

These instructions combine a multiply operation and an add operation
without an intermediate rounding operation. The fraction part of the
intermediate product is 106 bits wide (L bit, Fraction), and all 106 bits
take part in the add/subtract portion of the instruction.
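
The effect of skipping the intermediate rounding can be seen in a small model. The following is only a sketch: it assumes double-precision operands and Round to Nearest, and it uses exact rational arithmetic to stand in for the wide intermediate product. The function names are illustrative and do not come from the architecture.

```python
from fractions import Fraction


def fused_multiply_add(a: float, c: float, b: float) -> float:
    """Model (a * c) + b with a single rounding at the very end."""
    # Exact rational arithmetic stands in for the unrounded intermediate product.
    exact = Fraction(a) * Fraction(c) + Fraction(b)
    # Converting the exact value to a double rounds once, to nearest.
    return float(exact)


def unfused_multiply_add(a: float, c: float, b: float) -> float:
    """Model a separate multiply followed by an add (two roundings)."""
    product = a * c  # rounded to double precision here
    return product + b  # rounded again here


a = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30
b = -1.0

# The exact product is 1 - 2**-60, which a separate multiply rounds to 1.0.
fused = fused_multiply_add(a, c, b)
unfused = unfused_multiply_add(a, c, b)
```

With these operands the separately rounded sequence loses the low-order term entirely, while the single-rounding model keeps it.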
Floating Multiply-Add [Single] A-form

fmadd    FRT,FRA,FRC,FRB    (Rc=0)
fmadd.   FRT,FRA,FRC,FRB    (Rc=1)

[POWER mnemonics: fma, fma.]

Field:      63   FRT   FRA   FRB   FRC   29   Rc
Start bit:  0    6     11    16    21    26   31

fmadds   FRT,FRA,FRC,FRB    (Rc=0)
fmadds.  FRT,FRA,FRC,FRB    (Rc=1)

Field:      59   FRT   FRA   FRB   FRC   29   Rc
Start bit:  0    6     11    16    21    26   31


The operation

FRT <- [(FRA)×(FRC)] + (FRB)

is performed.

The floating-point operand in register FRA is multiplied by the floating-
point operand in register FRC. The floating-point operand in register
FRB is added to this intermediate result.

If the most significant bit of the resultant significand is not 1, the result
is normalized. The result is rounded to the target precision under control
of the Floating-Point Rounding Control field RN of the FPSCR and
placed into register FRT.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.



Special Registers Altered 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc=1)

Floating Multiply-Subtract [Single] A-form

fmsub    FRT,FRA,FRC,FRB    (Rc=0)
fmsub.   FRT,FRA,FRC,FRB    (Rc=1)

[POWER mnemonics: fms, fms.]

Field:      63   FRT   FRA   FRB   FRC   28   Rc
Start bit:  0    6     11    16    21    26   31

fmsubs   FRT,FRA,FRC,FRB    (Rc=0)
fmsubs.  FRT,FRA,FRC,FRB    (Rc=1)

Field:      59   FRT   FRA   FRB   FRC   28   Rc
Start bit:  0    6     11    16    21    26   31


The operation

FRT <- [(FRA)×(FRC)] - (FRB)

is performed.

The floating-point operand in register FRA is multiplied by the floating-
point operand in register FRC. The floating-point operand in register
FRB is subtracted from this intermediate result.

If the most significant bit of the resultant significand is not 1, the result
is normalized. The result is rounded to the target precision under control
of the Floating-Point Rounding Control field RN of the FPSCR and
placed into register FRT.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.



Special Registers Altered 

FPRF FR FI 
FX OX UX XX 
VXSNAN VXISI VXIMZ 

CR1 (if Rc=1)

Floating Negative Multiply-Add [Single] A-form

fnmadd    FRT,FRA,FRC,FRB    (Rc=0)
fnmadd.   FRT,FRA,FRC,FRB    (Rc=1)

[POWER mnemonics: fnma, fnma.]

Field:      63   FRT   FRA   FRB   FRC   31   Rc
Start bit:  0    6     11    16    21    26   31

fnmadds   FRT,FRA,FRC,FRB    (Rc=0)
fnmadds.  FRT,FRA,FRC,FRB    (Rc=1)

Field:      59   FRT   FRA   FRB   FRC   31   Rc
Start bit:  0    6     11    16    21    26   31

The operation

FRT <- - ( [(FRA)×(FRC)] + (FRB) )

is performed.

The floating-point operand in register FRA is multiplied by the floating-
point operand in register FRC. The floating-point operand in register
FRB is added to this intermediate result.

If the most significant bit of the resultant significand is not 1, the result
is normalized. The result is rounded to the target precision under control
of the Floating-Point Rounding Control field RN of the FPSCR, then
negated and placed into register FRT.

This instruction produces the same result as would be obtained by
using the Floating Multiply-Add instruction and then negating the result,
with the following exceptions:

■ QNaNs propagate with no effect on their "sign" bit.

■ QNaNs that are generated as the result of a disabled Invalid Operation
Exception have a "sign" bit of 0.

■ SNaNs that are converted to QNaNs as the result of a disabled Invalid
Operation Exception retain the "sign" bit of the SNaN.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.

Special Registers Altered

FPRF FR FI
FX OX UX XX
VXSNAN VXISI VXIMZ

CR1 (if Rc=1)

Floating Negative Multiply-Subtract [Single] A-form

fnmsub    FRT,FRA,FRC,FRB    (Rc=0)
fnmsub.   FRT,FRA,FRC,FRB    (Rc=1)

[POWER mnemonics: fnms, fnms.]

Field:      63   FRT   FRA   FRB   FRC   30   Rc
Start bit:  0    6     11    16    21    26   31

fnmsubs   FRT,FRA,FRC,FRB    (Rc=0)
fnmsubs.  FRT,FRA,FRC,FRB    (Rc=1)

Field:      59   FRT   FRA   FRB   FRC   30   Rc
Start bit:  0    6     11    16    21    26   31

The operation

FRT <- - ( [(FRA)×(FRC)] - (FRB) )

is performed.

The floating-point operand in register FRA is multiplied by the floating-
point operand in register FRC. The floating-point operand in register
FRB is subtracted from this intermediate result.

If the most significant bit of the resultant significand is not 1, the result
is normalized. The result is rounded to the target precision under control
of the Floating-Point Rounding Control field RN of the FPSCR, then
negated and placed into register FRT.

This instruction produces the same result as would be obtained by
using the Floating Multiply-Subtract instruction and then negating the
result, with the following exceptions:

■ QNaNs propagate with no effect on their "sign" bit.

■ QNaNs that are generated as the result of a disabled Invalid Operation
Exception have a "sign" bit of 0.

■ SNaNs that are converted to QNaNs as the result of a disabled Invalid
Operation Exception retain the "sign" bit of the SNaN.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.

Special Registers Altered

FPRF FR FI
FX OX UX XX
VXSNAN VXISI VXIMZ

CR1 (if Rc=1)



4.6.6 Floating-Point Rounding and
Conversion instructions

Floating Round to Single-Precision X-form

frsp    FRT,FRB    (Rc=0)
frsp.   FRT,FRB    (Rc=1)

[Instruction format, partially recoverable: ... | FRT | ... | FRB | 12 | Rc]

Programming Note

Examples of uses of the conversion instructions can be found in
Appendix E.3, "Floating-Point Conversions," on page 259.

If it is already in single-precision range, the floating-point operand in
register FRB is placed into register FRT. Otherwise, the floating-point
operand in register FRB is rounded to single-precision using the rounding
mode specified by FPSCR_RN and placed into register FRT.

The rounding is described fully in Appendix B.1, "Floating-Point
Round to Single-Precision Model," on page 203.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.



Special Registers Altered

FPRF FR FI
FX OX UX XX
VXSNAN
CR1 (if Rc=1)

Floating Convert To Integer Doubleword X-form

fctid    FRT,FRB    (Rc=0)
fctid.   FRT,FRB    (Rc=1)

The floating-point operand in register FRB is converted to a 64-bit
signed fixed-point integer, using the rounding mode specified by
FPSCR_RN, and placed into register FRT.

If the operand in FRB is greater than 2^63 - 1, then FRT is set to
0x7FFF_FFFF_FFFF_FFFF. If the operand in FRB is less than -2^63, then
FRT is set to 0x8000_0000_0000_0000.
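
The rounding and saturation behavior described above can be sketched as follows. This is an illustrative model only, not the model of Appendix B.2: the function name is hypothetical, the RN encodings (0b00 nearest, 0b01 toward zero, 0b10 toward +infinity, 0b11 toward -infinity) are taken from the rounding-mode comments in Appendix B.1, and NaN operands are deliberately left out because the text above does not describe them.

```python
import math

INT64_MAX = 2 ** 63 - 1
INT64_MIN = -(2 ** 63)


def fctid_model(x: float, rn: int) -> int:
    """Convert a double to a signed 64-bit integer with saturation."""
    if math.isnan(x):
        # NaN handling is not covered by the paragraph being illustrated.
        raise ValueError("NaN operands are outside this sketch")
    if x > INT64_MAX:  # also catches +infinity
        return INT64_MAX
    if x < INT64_MIN:  # also catches -infinity
        return INT64_MIN
    if rn == 0b00:  # Round to Nearest (ties go to the even integer)
        return round(x)
    if rn == 0b01:  # Round toward Zero
        return math.trunc(x)
    if rn == 0b10:  # Round toward +Infinity
        return math.ceil(x)
    if rn == 0b11:  # Round toward -Infinity
        return math.floor(x)
    raise ValueError("rn must be a 2-bit value")
```

Passing rn=0b01 gives the behavior of the round-toward-zero variant of the instruction; masking the result with 2**64 - 1 shows the two's-complement bit pattern that would appear in FRT.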

The conversion is described fully in Appendix B.2, "Floating-Point 
Convert to Integer Model," on page 209. 

Except for enabled Invalid Operation Exceptions, FPSCR_FPRF is undefined.
FPSCR_FR is set if the result is incremented when rounded. FPSCR_FI
is set if the result is inexact.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause the system illegal instruction error 
handler to be invoked. 



Special Registers Altered 

FPRF(undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc=1)



Floating Convert To Integer Doubleword with round toward 
Zero X-form 

fctidz FRT,FRB (Rc=0) 

fctidz. FRT,FRB (Rc=l) 
The floating-point operand in register FRB is converted to a 64-bit
signed fixed-point integer, using the rounding mode Round toward Zero,
and placed into register FRT.

If the operand in FRB is greater than 2^63 - 1, then FRT is set to
0x7FFF_FFFF_FFFF_FFFF. If the operand in FRB is less than -2^63, then
FRT is set to 0x8000_0000_0000_0000.

The conversion is described fully in Appendix B.2, "Floating-Point
Convert to Integer Model," on page 209.

Except for enabled Invalid Operation Exceptions, FPSCR_FPRF is undefined.
FPSCR_FR is set if the result is incremented when rounded. FPSCR_FI
is set if the result is inexact.

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered 

FPRF(undefined) FR FI 

FX XX 

VXSNAN VXCVI 

CR1 (if Rc=1)

Floating Convert To Integer Word X-form

fctiw    FRT,FRB    (Rc=0)
fctiw.   FRT,FRB    (Rc=1)

[POWER2 mnemonics: fcir, fcir.]

The floating-point operand in register FRB is converted to a 32-bit
signed fixed-point integer, using the rounding mode specified by
FPSCR_RN, and placed in bits 32:63 of register FRT. Bits 0:31 of register
FRT are undefined.

If the operand in FRB is greater than 2^31 - 1, then bits 32:63 of FRT are
set to 0x7FFF_FFFF. If the operand in FRB is less than -2^31, then bits
32:63 of FRT are set to 0x8000_0000.

The conversion is described fully in Appendix B.2, "Floating-Point 
Convert to Integer Model," on page 209. 

Except for enabled Invalid Operation Exceptions, FPSCR_FPRF is undefined.
FPSCR_FR is set if the result is incremented when rounded. FPSCR_FI
is set if the result is inexact.

Special Registers Altered 

FPRF(undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc=1)

Floating Convert To Integer Word with round toward
Zero X-form 

fctiwz FRT,FRB (Rc=0) 

fctiwz. FRT,FRB (Rc=l) 

[POWER2 mnemonics: fcirz, fcirz.] 
The floating-point operand in register FRB is converted to a 32-bit
signed fixed-point integer, using the rounding mode Round toward Zero,
and placed in bits 32:63 of register FRT. Bits 0:31 of register FRT are
undefined.

If the operand in FRB is greater than 2^31 - 1, then bits 32:63 of FRT are
set to 0x7FFF_FFFF. If the operand in FRB is less than -2^31, then bits
32:63 of FRT are set to 0x8000_0000.

The conversion is described fully in Appendix B.2, "Floating-Point 
Convert to Integer Model," on page 209. 

Except for enabled Invalid Operation Exceptions, FPSCR_FPRF is undefined.
FPSCR_FR is set if the result is incremented when rounded. FPSCR_FI
is set if the result is inexact.



Special Registers Altered 

FPRF(undefined) FR FI 
FX XX 

VXSNAN VXCVI 

CR1 (if Rc=1)



Floating Convert From Integer Doubleword X-form

fcfid    FRT,FRB    (Rc=0)
fcfid.   FRT,FRB    (Rc=1)

The 64-bit signed fixed-point operand in register FRB is converted to
an infinitely precise floating-point integer. If the result of the conversion is
already in double-precision range, it is placed into register FRT. Otherwise
the result of the conversion is rounded to double-precision, using the
rounding mode specified by FPSCR_RN, and placed into register FRT.

The conversion is described fully in Appendix B.3, "Floating-Point
Convert from Integer Model," on page 212.

FPSCR_FPRF is set to the class and sign of the result. FPSCR_FR is set if
the result is incremented when rounded. FPSCR_FI is set if the result is
inexact.
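
As a rough illustration of when the conversion is inexact, the sketch below converts a signed 64-bit integer to a double and reports whether any precision was lost. It assumes Round to Nearest only and relies on the host language's correctly rounded integer-to-double conversion; the function name is hypothetical and this is not the model of Appendix B.3.

```python
def fcfid_model(n: int) -> tuple:
    """Return (double result, inexact flag) for a signed 64-bit integer."""
    if not -(2 ** 63) <= n <= 2 ** 63 - 1:
        raise ValueError("operand must fit in a signed 64-bit integer")
    # Python's int-to-float conversion rounds to nearest, ties to even.
    result = float(n)
    # The conversion is inexact when the double no longer equals the integer.
    inexact = int(result) != n
    return result, inexact
```

Integers of magnitude up to 2**53 always convert exactly; larger ones may need rounding because a double carries only 53 significant bits.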

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause the system illegal instruction error
handler to be invoked.

Special Registers Altered 

FPRF FR FI 
FX XX 

CR1 (if Rc=1)



4.6.7 Floating-Point Compare instructions 

The floating-point Compare instructions compare the contents of two 
floating-point registers. Comparison ignores the sign of zero (i.e., regards 
+0 as equal to -0). The comparison can be ordered or unordered. 

The comparison sets one bit in the designated CR field to 1 and the 
other three to 0. The FPCC is set in the same way. 

The CR field and the FPCC are interpreted as follows: 

Bit   Name   Description
0     FL     (FRA) < (FRB)
1     FG     (FRA) > (FRB)
2     FE     (FRA) = (FRB)
3     FU     (FRA) ? (FRB) (unordered)



Floating Compare Unordered X-form 

fcmpu BF,FRA,FRB 
if (FRA) is a NaN or
   (FRB) is a NaN then c <- 0b0001
else if (FRA) < (FRB) then c <- 0b1000
else if (FRA) > (FRB) then c <- 0b0100
else c <- 0b0010

FPCC <- c
CR4×BF:4×BF+3 <- c

if (FRA) is an SNaN or
   (FRB) is an SNaN then
      VXSNAN <- 1

The floating-point operand in register FRA is compared to the float- 
ing-point operand in register FRB. The result of the compare is placed 
into CR field BF and the FPCC. 

If either of the operands is a NaN, either quiet or signaling, then CR
field BF and the FPCC are set to reflect unordered. If either of the
operands is a Signaling NaN, then VXSNAN is set.
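
The 4-bit result code can be modelled directly from the pseudocode above. The sketch below is illustrative only: the function name is hypothetical, and it computes just the code written to the CR field and the FPCC, not the VXSNAN side effect.

```python
import math


def fcmpu_code(fra: float, frb: float) -> int:
    """Return the 4-bit code (FL, FG, FE, FU) for an unordered compare."""
    if math.isnan(fra) or math.isnan(frb):
        return 0b0001  # FU: unordered
    if fra < frb:
        return 0b1000  # FL: less than
    if fra > frb:
        return 0b0100  # FG: greater than
    return 0b0010  # FE: equal (this includes +0 compared with -0)
```

Exactly one bit is set in every case, matching the statement that the comparison sets one bit to 1 and the other three to 0.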

Special Registers Altered 

CR field BF 
FPCC 

FX 

VXSNAN 



Floating Compare Ordered X-form 

fcmpo BF,FRA,FRB 
[Instruction format, partially recoverable: ... | BF | ... | FRA | FRB | ...]

if (FRA) is a NaN or
   (FRB) is a NaN then c <- 0b0001
else if (FRA) < (FRB) then c <- 0b1000
else if (FRA) > (FRB) then c <- 0b0100
else c <- 0b0010

FPCC <- c
CR4×BF:4×BF+3 <- c

if (FRA) is an SNaN or
   (FRB) is an SNaN then
      VXSNAN <- 1
      if VE = 0 then VXVC <- 1
else if (FRA) is a QNaN or
        (FRB) is a QNaN then VXVC <- 1

The floating-point operand in register FRA is compared to the floating-
point operand in register FRB. The result of the compare is placed
into CR field BF and the FPCC.

If either of the operands is a NaN, either quiet or signaling, then CR
field BF and the FPCC are set to reflect unordered. If either of the
operands is a Signaling NaN, then VXSNAN is set and, if Invalid Operation
is disabled (VE=0), VXVC is set. If neither operand is a Signaling NaN
but at least one operand is a Quiet NaN, then VXVC is set.
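
The exception-bit rules for the ordered compare can be sketched on raw 64-bit operand images. The helper names below are hypothetical. The NaN classification assumes the double-precision layout used in Appendix B.1, where an all-ones exponent with fraction bit 12 set is a QNaN and with bit 12 clear and a nonzero fraction is an SNaN.

```python
def classify_nan(bits: int) -> str:
    """Classify a 64-bit double image as 'SNaN', 'QNaN', or 'not NaN'."""
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent != 0x7FF or fraction == 0:
        return "not NaN"
    # The most significant fraction bit is bit 12 in the book's numbering.
    return "QNaN" if (fraction >> 51) & 1 else "SNaN"


def fcmpo_exception_bits(fra_bits: int, frb_bits: int, ve: int) -> tuple:
    """Return (VXSNAN, VXVC) as set by an ordered compare."""
    kinds = {classify_nan(fra_bits), classify_nan(frb_bits)}
    vxsnan = vxvc = 0
    if "SNaN" in kinds:
        vxsnan = 1
        if ve == 0:  # Invalid Operation Exception disabled
            vxvc = 1
    elif "QNaN" in kinds:
        vxvc = 1
    return vxsnan, vxvc
```

The SNaN test takes priority, so a compare of an SNaN with a QNaN follows the SNaN branch.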

Special Registers Altered 

CR field BF 

FPCC 

FX 

VXSNAN VXVC 

4.6.8 Floating-Point Status and Control 
Register Instructions 

Every Floating-Point Status and Control Register instruction appears to 
synchronize the effects of all floating-point instructions executed by a 
given processor. Executing a Floating-Point Status and Control Register 
instruction ensures that all floating-point instructions previously initiated 
by the given processor appear to have completed before the Floating- 
Point Status and Control Register instruction is initiated, and that no 
subsequent floating-point instructions appear to be initiated by the given 
processor until the Floating-Point Status and Control Register instruction 
has completed. In particular:

■ All exceptions that will be caused by the previously initiated instruc- 
tions are recorded in the FPSCR before the Floating-Point Status and 
Control Register instruction is initiated. 

■ All invocations of the system floating-point enabled exception error 
handler that will be caused by the previously initiated instructions 
have occurred before the Floating-Point Status and Control Register 
instruction is initiated. 

■ No subsequent floating-point instruction that depends on or alters the
settings of any FPSCR bits appears to be initiated until the Floating-
Point Status and Control Register instruction has completed.

(Floating-point Storage Access instructions are not affected.) 
Move From FPSCR X-form

Programming Note

When FPSCR0:3 is specified for mtfsfi, bits 0 (FX) and 3 (OX) are set to
the values of U0 and U3 (i.e., even if this instruction causes OX to
change from 0 to 1, FX is set from U0 and not by the usual rule that FX is
set to 1 when an exception bit changes from 0 to 1). Bits 1 and 2
(FEX and VX) are set according to the usual rule, given on page 137,
and not from U1:2.

mffs    FRT    (Rc=0)
mffs.   FRT    (Rc=1)

[Instruction format, partially recoverable: 63 | FRT | /// | /// | ... | Rc]

The contents of the FPSCR are placed into bits 32:63 of register FRT.
Bits 0:31 of register FRT are undefined.



Special Registers Altered

CR1 (if Rc=1)

Move to Condition Register from FPSCR X-form

mcrfs BF,BFA
[Instruction format, partially recoverable: ... | BF | // | BFA | // | /// | ...]

The contents of FPSCR field BFA are copied to CR field BF. All exception
bits copied (except FEX and VX) are set to 0 in the FPSCR.



Special Registers Altered

CR field BF
FX OX                      (if BFA=0)
UX ZX XX VXSNAN            (if BFA=1)
VXISI VXIDI VXZDZ VXIMZ    (if BFA=2)
VXVC                       (if BFA=3)
VXSOFT VXSQRT VXCVI        (if BFA=5)

Move To FPSCR Field Immediate X-form

mtfsfi    BF,U    (Rc=0)
mtfsfi.   BF,U    (Rc=1)
[Instruction format, partially recoverable: ... | BF | // | /// | U | / | ...]

The value of the U field is placed into FPSCR field BF.
FPSCR_FX is altered only if BF = 0.
Special Registers Altered

FPSCR field BF
CR1 (if Rc=1)

Move To FPSCR Fields XFL-form

mtfsf    FLM,FRB    (Rc=0)
mtfsf.   FLM,FRB    (Rc=1)

The contents of bits 32:63 of register FRB are placed into the FPSCR
under control of the field mask specified by FLM. The field mask identifies
the 4-bit fields affected. Let i be an integer in the range 0-7. If
FLM_i=1 then FPSCR field i (FPSCR bits 4×i through 4×i+3) is set to the
contents of the corresponding field of the low-order 32 bits of register
FRB.

FPSCR_FX is altered only if FLM_0 = 1.
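
The field-mask selection can be sketched as a loop over the eight 4-bit fields. This is a simplified, illustrative model: the function name is hypothetical, bit 0 is taken as the most significant bit as elsewhere in this book, and the special handling of FEX and VX in field 0 (which cannot be written directly) is not modelled.

```python
def mtfsf_model(fpscr: int, frb_low32: int, flm: int) -> int:
    """Copy the FPSCR fields selected by the 8-bit mask flm from frb_low32."""
    result = fpscr & 0xFFFFFFFF
    for i in range(8):
        # FLM bit i, counting bit 0 as the most significant mask bit.
        if (flm >> (7 - i)) & 1:
            # Field i covers FPSCR bits 4*i through 4*i+3 (big-endian numbering).
            shift = 28 - 4 * i
            field_mask = 0xF << shift
            result = (result & ~field_mask) | (frb_low32 & field_mask)
    return result & 0xFFFFFFFF
```

For example, a mask of 0b00000001 selects only field 7, the low-order four bits of the FPSCR.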



Special Registers Altered

FPSCR fields selected by mask
CR1 (if Rc=1)

Move To FPSCR Bit 0 X-form

mtfsb0    BT    (Rc=0)
mtfsb0.   BT    (Rc=1)

Field:      63   BT   ///   ///   70   Rc
Start bit:  0    6    11    16    21   31



Programming Note

When FPSCR0:3 is specified for mtfsf, bits 0 (FX) and 3 (OX) are set to
the values of (FRB)32 and (FRB)35 (i.e., even if this instruction causes
OX to change from 0 to 1, FX is set from (FRB)32 and not by the usual
rule that FX is set to 1 when an exception bit changes from 0 to 1).
Bits 1 and 2 (FEX and VX) are set according to the usual rule, given on
page 137, and not from (FRB)33:34.

Programming Note

Updating fewer than all eight fields of the FPSCR may result in
substantially poorer performance on some implementations than
updating all the fields.

Programming Note

Bits 1 and 2 (FEX and VX) cannot be explicitly reset by mtfsb0.



Bit BT of the FPSCR is set to 0.

Special Registers Altered

FPSCR bit BT
CR1 (if Rc=1)
Programming Note

Bits 1 and 2 (FEX and VX) cannot be explicitly set by mtfsb1.

Move To FPSCR Bit 1 X-form

mtfsb1    BT    (Rc=0)
mtfsb1.   BT    (Rc=1)

Field:      63   BT   ///   ///   38   Rc
Start bit:  0    6    11    16    21   31

Bit BT of the FPSCR is set to 1.

Special Registers Altered

FPSCR bits BT and FX
CR1 (if Rc=1)

The instructions described in this appendix are optional. If an instruction
is implemented that matches the semantics of an instruction described 
here, the implementation should be as specified here. The optional 
instructions are divided into two groups. Additional groups may be 
defined in the future. 

■ General Purpose group: fsqrt and fsqrts. 

■ Graphics group: stfiwx, fres, frsqrte, and fsel. 

If an implementation claims to support a given group, it must implement 
all the instructions in the group. 

A.1 Floating-Point Processor 
instructions 

A.1.1 Floating-Point Store Instruction 

Byte ordering on PowerPC is Big-Endian by default. See Appendix D, 
"Little-Endian Byte Ordering," on page 233 for the effects of operating a 
PowerPC system with Little-Endian byte ordering. 
Store Floating-Point as Integer Word Indexed X-form

stfiwx FRS,RA,RB

Field:      31   FRS   RA   RB   983   /
Start bit:  0    6     11   16   21    31


if RA = 0 then b <- 0
else b <- (RA)
EA <- b + (RB)
MEM(EA, 4) <- (FRS)32:63

Let the effective address (EA) be the sum (RA|0)+(RB).

The contents of the low-order 32 bits of register FRS are stored, without
conversion, into the word in storage addressed by EA.
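
The address calculation and store can be sketched with a byte array standing in for storage. The names below are illustrative assumptions, not part of the architecture: `memory` is a bytearray, `gpr` is a list of general-purpose register values, and the word is written Big-Endian (the default byte ordering). Address wraparound is not modelled.

```python
import struct


def stfiwx_model(memory: bytearray, frs_bits: int,
                 ra_index: int, rb_index: int, gpr: list) -> int:
    """Store the low-order 32 bits of a 64-bit FPR image; return the EA."""
    # (RA|0): register number 0 means a base of zero, not the contents of GPR 0.
    base = 0 if ra_index == 0 else gpr[ra_index]
    ea = base + gpr[rb_index]
    low_word = frs_bits & 0xFFFFFFFF  # (FRS)32:63, stored without conversion
    memory[ea:ea + 4] = struct.pack(">I", low_word)
    return ea
```

A typical use pairs this store with a preceding convert-to-integer-word instruction, whose result occupies exactly those low-order 32 bits.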

If the contents of register FRS were produced, either directly or indi- 
rectly, by a Load Floating-Point Single instruction, a single-precision 
arithmetic instruction, or frsp, then the value stored is undefined. (The
contents of register FRS are produced directly by such an instruction if 
FRS is the target register for the instruction. The contents of register FRS 
are produced indirectly by such an instruction if FRS is the final target 
register of a sequence of one or more Floating-Point Move instructions, 
with the input to the sequence having been produced directly by such an 
instruction.) 

Special Registers Altered 

None 



A.1.2 Floating-Point Arithmetic instructions 

Floating-Point Elementary Arithmetic Instructions

Floating Square Root [Single] A-form

fsqrt    FRT,FRB    (Rc=0)
fsqrt.   FRT,FRB    (Rc=1)

Field:      63   FRT   ///   FRB   ///   22   Rc
Start bit:  0    6     11    16    21    26   31

fsqrts   FRT,FRB    (Rc=0)
fsqrts.  FRT,FRB    (Rc=1)

Field:      59   FRT   ///   FRB   ///   22   Rc
Start bit:  0    6     11    16    21    26   31

The square root of the floating-point operand in register FRB is
placed into register FRT.

If the most significant bit of the resultant significand is not 1, the result 
is normalized. The result is rounded to the target precision under control 
of the Floating-Point Rounding Control field RN of the FPSCR and 
placed into register FRT. 

Operation with various special values of the operand is summarized
below.

Operand   Result   Exception
-∞        QNaN¹    VXSQRT
<0        QNaN¹    VXSQRT
-0        -0       None
+∞        +∞       None
SNaN      QNaN¹    VXSNAN
QNaN      QNaN     None

¹ No result if FPSCR_VE = 1.




FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1.

Special Registers Altered

FPRF FR FI
FX XX
VXSNAN VXSQRT

CR1 (if Rc=1)
Floating Reciprocal Estimate Single A-form

fres    FRT,FRB    (Rc=0)
fres.   FRT,FRB    (Rc=1)

Field:      59   FRT   ///   FRB   ///   24   Rc
Start bit:  0    6     11    16    21    26   31

A single-precision estimate of the reciprocal of the floating-point operand
in register FRB is placed into register FRT. The estimate placed into
register FRT is correct to a precision of one part in 256 of the reciprocal
of (FRB), i.e.,

$$\operatorname{ABS}\left(\frac{\text{estimate} - 1/x}{1/x}\right) \le \frac{1}{256}$$

where x is the initial value in FRB. Note that the value placed into register
FRT may vary between implementations, and between different executions
on the same implementation.
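
The accuracy requirement can be checked numerically. The following sketch is illustrative only: the function name is hypothetical, x is assumed to be finite and nonzero, and exact rational arithmetic is used so the check itself introduces no rounding error.

```python
from fractions import Fraction


def within_fres_bound(estimate: float, x: float) -> bool:
    """Check ABS((estimate - 1/x) / (1/x)) <= 1/256 using exact arithmetic."""
    exact = 1 / Fraction(x)  # the true reciprocal, as an exact rational
    relative_error = abs((Fraction(estimate) - exact) / exact)
    return relative_error <= Fraction(1, 256)
```

Because any value inside the bound is acceptable, two conforming implementations may return different estimates for the same operand, which is why the text warns that results can vary.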

Operation with various special values of the operand is summarized
below.

Operand   Result   Exception
-∞        -0       None
-0        -∞¹      ZX
+0        +∞¹      ZX
+∞        +0       None
SNaN      QNaN²    VXSNAN
QNaN      QNaN     None

¹ No result if FPSCR_ZE = 1.
² No result if FPSCR_VE = 1.



FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1 and Zero Divide Exceptions
when FPSCR_ZE=1.

Special Registers Altered

FPRF FR(undefined) FI(undefined)
FX OX UX ZX
VXSNAN

CR1 (if Rc=1)



Floating Reciprocal Square Root Estimate A-form

frsqrte    FRT,FRB    (Rc=0)
frsqrte.   FRT,FRB    (Rc=1)

Field:      63   FRT   ///   FRB   ///   26   Rc
Start bit:  0    6     11    16    21    26   31


A double-precision estimate of the reciprocal of the square root of the
floating-point operand in register FRB is placed into register FRT. The
estimate placed into register FRT is correct to a precision of one part in
32 of the reciprocal of the square root of (FRB), i.e.,

$$\operatorname{ABS}\left(\frac{\text{estimate} - 1/\sqrt{x}}{1/\sqrt{x}}\right) \le \frac{1}{32}$$

where x is the initial value in FRB. Note that the value placed into register
FRT may vary between implementations, and between different executions
on the same implementation.

Operation with various special values of the operand is summarized
below.

Operand   Result   Exception
-∞        QNaN²    VXSQRT
<0        QNaN²    VXSQRT
-0        -∞¹      ZX
+0        +∞¹      ZX
+∞        +0       None
SNaN      QNaN²    VXSNAN
QNaN      QNaN     None

¹ No result if FPSCR_ZE = 1.
² No result if FPSCR_VE = 1.

FPSCR_FPRF is set to the class and sign of the result, except for Invalid
Operation Exceptions when FPSCR_VE=1 and Zero Divide Exceptions
when FPSCR_ZE=1.

Special Registers Altered

FPRF FR(undefined) FI(undefined)
FX ZX
VXSNAN VXSQRT

CR1 (if Rc=1)



Programming Note 

Examples of uses of the 
fsel instruction can be 
found in Sections E.3, 
"Floating-Point 
Conversions," on 
page 259, and E.4, 
"Floating-Point 
Selection," on page 264. 

Warning: 

Care must be taken in 
using fsel if IEEE 
compatibility is required, 
or if the values being 
tested can be NaNs or 
infinities; see Section 
E.4.4, "Notes," on 
page 266. 



A.1.3 Floating-Point Select Instruction

Floating Select A-form

fsel    FRT,FRA,FRC,FRB    (Rc=0)
fsel.   FRT,FRA,FRC,FRB    (Rc=1)

Field:      63   FRT   FRA   FRB   FRC   23   Rc
Start bit:  0    6     11    16    21    26   31


if (FRA) ≥ 0.0 then FRT <- (FRC)
else FRT <- (FRB)



The floating-point operand in register FRA is compared to the value 
zero. If the operand is greater than or equal to zero, register FRT is set to 
the contents of register FRC. If the operand is less than zero or is a NaN, 
register FRT is set to the contents of register FRB. The comparison 
ignores the sign of zero (i.e., regards +0 as equal to -0). 
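
The selection rule is small enough to model in one line. The sketch below is illustrative (the function name is hypothetical); it relies on the fact that an ordinary "greater than or equal" comparison is false for NaN and true for both +0 and -0, which matches the behavior described above.

```python
def fsel_model(fra: float, frc: float, frb: float) -> float:
    """Return frc when fra >= 0 (including -0), otherwise frb (including NaN)."""
    # NaN >= 0.0 evaluates to False, so a NaN in fra selects frb.
    return frc if fra >= 0.0 else frb
```

For instance, `fsel_model(x, x, -x)` yields the absolute value of x for ordinary numbers, though, as the Warning notes, such idioms need care when NaNs or infinities can occur.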



Special Registers Altered

CR1 (if Rc=1)
Floating-Point Mode




B.1 Floating-Point Round to Single-
Precision Model

The following describes algorithmically the operation of the Floating 
Round to Single-Precision instruction. 

If (FRB)1:11 < 897 and (FRB)1:63 > 0 then
   Do
      If FPSCR_UE = 0 then goto Disabled Exponent Underflow
      If FPSCR_UE = 1 then goto Enabled Exponent Underflow
   End

If (FRB)1:11 > 1150 and (FRB)1:11 < 2047 then
   Do
      If FPSCR_OE = 0 then goto Disabled Exponent Overflow
      If FPSCR_OE = 1 then goto Enabled Exponent Overflow
   End

If (FRB)1:11 > 896 and (FRB)1:11 < 1151 then goto Normal Operand
If (FRB)1:63 = 0 then goto Zero Operand
If (FRB)1:11 = 2047 then
   Do
      If (FRB)12:63 = 0 then goto Infinity Operand
      If (FRB)12 = 1 then goto QNaN Operand
      If (FRB)12 = 0 and (FRB)13:63 > 0 then goto SNaN Operand
   End
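
The dispatch above depends only on the biased exponent and fraction of the operand. It can be sketched as a function over the 64-bit operand image. The function and helper names are illustrative assumptions, and the sketch returns the label of the case that would be entered rather than performing the rounding.

```python
import struct


def double_bits(x: float) -> int:
    """Return the 64-bit image of a double (bit 0 is the most significant)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def frsp_case(frb_bits: int, ue: int = 0, oe: int = 0) -> str:
    """Name the case selected for a given operand image and enable bits."""
    exponent = (frb_bits >> 52) & 0x7FF       # (FRB)1:11
    magnitude = frb_bits & ((1 << 63) - 1)    # (FRB)1:63
    fraction = frb_bits & ((1 << 52) - 1)     # (FRB)12:63
    if exponent < 897 and magnitude > 0:
        return ("Enabled" if ue else "Disabled") + " Exponent Underflow"
    if 1150 < exponent < 2047:
        return ("Enabled" if oe else "Disabled") + " Exponent Overflow"
    if 896 < exponent < 1151:
        return "Normal Operand"
    if magnitude == 0:
        return "Zero Operand"
    # Only exponent = 2047 remains: infinity or a NaN.
    if fraction == 0:
        return "Infinity Operand"
    if (fraction >> 51) & 1:                   # (FRB)12 = 1
        return "QNaN Operand"
    return "SNaN Operand"
```

The thresholds 897 and 1150 are the double-precision biased exponents corresponding to the single-precision exponent range -126 to 127 (897 - 1023 = -126 and 1150 - 1023 = 127).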
Disabled Exponent Underflow:
   sign <- (FRB)0
   If (FRB)1:11 = 0 then
      Do
         exp <- -1022
         frac0:52 <- 0b0 || (FRB)12:63
      End
   If (FRB)1:11 > 0 then
      Do
         exp <- (FRB)1:11 - 1023
         frac0:52 <- 0b1 || (FRB)12:63
      End
   Denormalize operand:
      G || R || X <- 0b000
      Do while exp < -126
         exp <- exp + 1
         frac0:52 || G || R || X <- 0b0 || frac0:52 || G || (R | X)
      End
   FPSCR_UX <- frac24:52 || G || R || X > 0
   Round Single(sign,exp,frac0:52,G,R,X)
   FPSCR_XX <- FPSCR_XX | FPSCR_FI
   If frac0:52 = 0 then
      Do
         FRT0 <- sign
         FRT1:63 <- 0
         If sign = 0 then FPSCR_FPRF <- "+zero"
         If sign = 1 then FPSCR_FPRF <- "-zero"
      End
   If frac0:52 > 0 then
      Do
         If frac0 = 1 then
            Do
               If sign = 0 then FPSCR_FPRF <- "+normal number"
               If sign = 1 then FPSCR_FPRF <- "-normal number"
            End
         If frac0 = 0 then
            Do
               If sign = 0 then FPSCR_FPRF <- "+denormalized number"
               If sign = 1 then FPSCR_FPRF <- "-denormalized number"
            End
         Normalize operand:
            Do while frac0 = 0
               exp <- exp - 1
               frac0:52 <- frac1:52 || 0b0
            End
         FRT0 <- sign
         FRT1:11 <- exp + 1023
         FRT12:63 <- frac1:52
      End
   Done
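
The "Denormalize operand" loop shifts the significand right one bit per step while collecting the bits shifted out into the guard (G), round (R), and sticky (X) bits. The sketch below models only that loop. The function name is hypothetical, and frac0:52 is held as a 53-bit Python integer whose least significant bit is frac52.

```python
def denormalize(exp: int, frac: int) -> tuple:
    """Shift frac right until exp reaches -126; return (exp, frac, G, R, X)."""
    g = r = x = 0
    while exp < -126:
        exp += 1
        # One step of: frac || G || R || X <- 0b0 || frac || G || (R | X)
        x = r | x       # the sticky bit accumulates everything shifted past R
        r = g           # the old guard bit moves into the round position
        g = frac & 1    # the bit shifted out of frac52 becomes the guard bit
        frac >>= 1      # a zero is shifted in at frac0
    return exp, frac, g, r, x
```

Once a 1 bit reaches X it stays there, so the later rounding step still knows that the discarded part was nonzero.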

Enabled Exponent Underflow:
   FPSCR_UX <- 1
   sign <- (FRB)0
   If (FRB)1:11 = 0 then
      Do
         exp <- -1022
         frac0:52 <- 0b0 || (FRB)12:63
      End
   If (FRB)1:11 > 0 then
      Do
         exp <- (FRB)1:11 - 1023
         frac0:52 <- 0b1 || (FRB)12:63
      End
   Normalize operand:
      Do while frac0 = 0
         exp <- exp - 1
         frac0:52 <- frac1:52 || 0b0
      End
   Round Single(sign,exp,frac0:52,0,0,0)
   FPSCR_XX <- FPSCR_XX | FPSCR_FI
   exp <- exp + 192
   FRT0 <- sign
   FRT1:11 <- exp + 1023
   FRT12:63 <- frac1:52
   If sign = 0 then FPSCR_FPRF <- "+normal number"
   If sign = 1 then FPSCR_FPRF <- "-normal number"
   Done
Disabled Exponent Overflow:
   FPSCR_OX <- 1
   If FPSCR_RN = 0b00 then /* Round to Nearest */
      Do
         If (FRB)0 = 0 then FRT <- 0x7FF0_0000_0000_0000
         If (FRB)0 = 1 then FRT <- 0xFFF0_0000_0000_0000
         If (FRB)0 = 0 then FPSCR_FPRF <- "+infinity"
         If (FRB)0 = 1 then FPSCR_FPRF <- "-infinity"
      End
   If FPSCR_RN = 0b01 then /* Round toward Zero */
      Do
         If (FRB)0 = 0 then FRT <- 0x47EF_FFFF_E000_0000
         If (FRB)0 = 1 then FRT <- 0xC7EF_FFFF_E000_0000
         If (FRB)0 = 0 then FPSCR_FPRF <- "+normal number"
         If (FRB)0 = 1 then FPSCR_FPRF <- "-normal number"
      End
   If FPSCR_RN = 0b10 then /* Round toward +Infinity */
      Do
         If (FRB)0 = 0 then FRT <- 0x7FF0_0000_0000_0000
         If (FRB)0 = 1 then FRT <- 0xC7EF_FFFF_E000_0000
         If (FRB)0 = 0 then FPSCR_FPRF <- "+infinity"
         If (FRB)0 = 1 then FPSCR_FPRF <- "-normal number"
      End
   If FPSCR_RN = 0b11 then /* Round toward -Infinity */
      Do
         If (FRB)0 = 0 then FRT <- 0x47EF_FFFF_E000_0000
         If (FRB)0 = 1 then FRT <- 0xFFF0_0000_0000_0000
         If (FRB)0 = 0 then FPSCR_FPRF <- "+normal number"
         If (FRB)0 = 1 then FPSCR_FPRF <- "-infinity"
      End
   FPSCR_FR <- undefined
   FPSCR_FI <- 1
   FPSCR_XX <- 1
   Done

Enabled Exponent Overflow:
  sign <- (FRB)[0]
  exp <- (FRB)[1:11] - 1023
  frac[0:52] <- 0b1 || (FRB)[12:63]
  Round Single(sign,exp,frac[0:52],0,0,0)
  FPSCR[XX] <- FPSCR[XX] | FPSCR[FI]

Enabled Overflow:
  FPSCR[OX] <- 1
  exp <- exp - 192
  FRT[0] <- sign
  FRT[1:11] <- exp + 1023
  FRT[12:63] <- frac[1:52]
  If sign = 0 then FPSCR[FPRF] <- "+normal number"
  If sign = 1 then FPSCR[FPRF] <- "-normal number"
  Done

Zero Operand:
  FRT <- (FRB)
  If (FRB)[0] = 0 then FPSCR[FPRF] <- "+zero"
  If (FRB)[0] = 1 then FPSCR[FPRF] <- "-zero"
  FPSCR[FR FI] <- 0b00
  Done

Infinity Operand:
  FRT <- (FRB)
  If (FRB)[0] = 0 then FPSCR[FPRF] <- "+infinity"
  If (FRB)[0] = 1 then FPSCR[FPRF] <- "-infinity"
  FPSCR[FR FI] <- 0b00
  Done

QNaN Operand:
  FRT <- (FRB)[0:34] || (29 zero bits)
  FPSCR[FPRF] <- "QNaN"
  FPSCR[FR FI] <- 0b00
  Done

SNaN Operand:
  FPSCR[VXSNAN] <- 1
  If FPSCR[VE] = 0 then
    Do
      FRT[0:11] <- (FRB)[0:11]
      FRT[12] <- 1
      FRT[13:63] <- (FRB)[13:34] || (29 zero bits)
      FPSCR[FPRF] <- "QNaN"
    End
  FPSCR[FR FI] <- 0b00
  Done

Normal Operand:
  sign <- (FRB)[0]
  exp <- (FRB)[1:11] - 1023
  frac[0:52] <- 0b1 || (FRB)[12:63]
  Round Single(sign,exp,frac[0:52],0,0,0)
  FPSCR[XX] <- FPSCR[XX] | FPSCR[FI]
  If exp > 127 and FPSCR[OE] = 0 then go to Disabled Exponent Overflow
  If exp > 127 and FPSCR[OE] = 1 then go to Enabled Overflow
  FRT[0] <- sign
  FRT[1:11] <- exp + 1023
  FRT[12:63] <- frac[1:52]
  If sign = 0 then FPSCR[FPRF] <- "+normal number"
  If sign = 1 then FPSCR[FPRF] <- "-normal number"
  Done

Round Single(sign,exp,frac[0:52],G,R,X):
  /* comparisons ignore u bits */
  inc <- 0
  lsb <- frac[23]
  gbit <- frac[24]
  rbit <- frac[25]
  xbit <- (frac[26:52] || G || R || X) ≠ 0
  If FPSCR[RN] = 0b00 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0bu011u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc <- 1
    End
  If FPSCR[RN] = 0b10 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc <- 1
    End
  If FPSCR[RN] = 0b11 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc <- 1
    End
  frac[0:23] <- frac[0:23] + inc
  If carry_out = 1 then
    Do
      frac[0:23] <- 0b1 || frac[0:22]
      exp <- exp + 1
    End
  frac[24:52] <- (29 zero bits)
  FPSCR[FR] <- inc
  FPSCR[FI] <- gbit | rbit | xbit
  Return
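
The Round Single procedure above can be sketched in Python as follows. This is an illustrative model only, not part of the architecture definition: the function names `round_single` and `frsp_normal` are hypothetical, the 53-bit significand is held in a Python integer with bit 0 as the most significant bit, and `frsp_normal` covers only the Normal Operand path with no overflow or underflow.

```python
import struct

def round_single(sign, exp, frac, rn, g=0, r=0, x=0):
    """Round the 53-bit significand frac[0:52] (bit 0 = MSB) to 24 bits."""
    def bit(i):
        return (frac >> (52 - i)) & 1
    lsb, gbit, rbit = bit(23), bit(24), bit(25)
    # xbit is the OR of frac[26:52] (the low 27 bits) and the incoming G, R, X.
    xbit = int((frac & ((1 << 27) - 1)) != 0 or g or r or x)
    inexact = gbit | rbit | xbit
    if rn == 0b00:            # round to nearest, ties to even
        inc = gbit & (lsb | rbit | xbit)
    elif rn == 0b10:          # round toward +infinity
        inc = int(sign == 0 and inexact)
    elif rn == 0b11:          # round toward -infinity
        inc = int(sign == 1 and inexact)
    else:                     # 0b01, round toward zero
        inc = 0
    top = (frac >> 29) + inc  # frac[0:23] + inc
    if top >> 24:             # carry out of the 24-bit field
        top = 1 << 23         # 0b1 || frac[0:22], where frac[0:22] is now zero
        exp += 1
    # frac[24:52] is cleared; the last two values model FPSCR[FR] and FPSCR[FI].
    return exp, top << 29, inc, inexact

def frsp_normal(d, rn=0b00):
    """Normal Operand path only: round a normal double to single precision."""
    bits = struct.unpack(">Q", struct.pack(">d", d))[0]
    sign = bits >> 63
    exp = ((bits >> 52) & 0x7FF) - 1023
    frac = (1 << 52) | (bits & ((1 << 52) - 1))
    exp, frac, _fr, _fi = round_single(sign, exp, frac, rn)
    out = (sign << 63) | ((exp + 1023) << 52) | (frac & ((1 << 52) - 1))
    return struct.unpack(">d", struct.pack(">Q", out))[0]
```

In round-to-nearest mode the result can be compared with the host's own double-to-single conversion, for example `struct.unpack("f", struct.pack("f", 1.1))[0]`.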



B.2 Floating-Point Convert to Integer Model

The following describes algorithmically the operation of the Floating 
Convert To Integer instructions. 

If Floating Convert To Integer Word then
  Do
    round_mode <- FPSCR[RN]
    tgt_precision <- "32-bit integer"
  End

If Floating Convert To Integer Word with Round toward Zero then
  Do
    round_mode <- 0b01
    tgt_precision <- "32-bit integer"
  End

If Floating Convert To Integer Doubleword then
  Do
    round_mode <- FPSCR[RN]
    tgt_precision <- "64-bit integer"
  End

If Floating Convert To Integer Doubleword with Round toward Zero then
  Do
    round_mode <- 0b01
    tgt_precision <- "64-bit integer"
  End

sign <- (FRB)[0]

If (FRB)[1:11] = 2047 and (FRB)[12:63] = 0 then goto Infinity Operand
If (FRB)[1:11] = 2047 and (FRB)[12] = 0 then goto SNaN Operand
If (FRB)[1:11] = 2047 and (FRB)[12] = 1 then goto QNaN Operand
If (FRB)[1:11] > 1086 then goto Large Operand

If (FRB)[1:11] > 0 then exp <- (FRB)[1:11] - 1023 /* exp - bias */
If (FRB)[1:11] = 0 then exp <- -1022

/* normal; need leading 0 for later complement */
If (FRB)[1:11] > 0 then frac[0:64] <- 0b01 || (FRB)[12:63] || (11 zero bits)
/* denormal */
If (FRB)[1:11] = 0 then frac[0:64] <- 0b00 || (FRB)[12:63] || (11 zero bits)

gbit || rbit || xbit <- 0b000

Do i=1,63-exp /* do the loop 0 times if exp = 63 */
  frac[0:64] || gbit || rbit || xbit <- 0b0 || frac[0:64] || gbit || (rbit | xbit)
End

Round Integer(sign,frac[0:64],gbit,rbit,xbit,round_mode)

/* needed leading 0 for -2^64 < (FRB) < -2^63 */
If sign = 1 then frac[0:64] <- ¬frac[0:64] + 1

If tgt_precision = "32-bit integer" and frac[0:64] > 2^31 - 1
  then goto Large Operand
If tgt_precision = "64-bit integer" and frac[0:64] > 2^63 - 1
  then goto Large Operand
If tgt_precision = "32-bit integer" and frac[0:64] < -2^31
  then goto Large Operand
If tgt_precision = "64-bit integer" and frac[0:64] < -2^63
  then goto Large Operand

FPSCR[XX] <- FPSCR[XX] | FPSCR[FI]

/* u is undefined hex digit */
If tgt_precision = "32-bit integer" then FRT <- 0xuuuu_uuuu || frac[33:64]
If tgt_precision = "64-bit integer" then FRT <- frac[1:64]
FPSCR[FPRF] <- undefined
Done

Round Integer(sign,frac[0:64],gbit,rbit,xbit,round_mode):
  /* comparisons ignore u bits */
  inc <- 0
  If round_mode = 0b00 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0bu11uu then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0bu011u then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0bu01u1 then inc <- 1
    End
  If round_mode = 0b10 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0b0u1uu then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0b0uu1u then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0b0uuu1 then inc <- 1
    End
  If round_mode = 0b11 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0b1u1uu then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0b1uu1u then inc <- 1
      If sign || frac[64] || gbit || rbit || xbit = 0b1uuu1 then inc <- 1
    End
  frac[0:64] <- frac[0:64] + inc
  FPSCR[FR] <- inc
  FPSCR[FI] <- gbit | rbit | xbit
  Return

Infinity Operand:
  FPSCR[FR FI VXCVI] <- 0b001
  If FPSCR[VE] = 0 then Do
    If tgt_precision = "32-bit integer" then
      Do /* u is undefined hex digit */
        If sign = 0 then FRT <- 0xuuuu_uuuu_7FFF_FFFF
        If sign = 1 then FRT <- 0xuuuu_uuuu_8000_0000
      End
    Else
      Do
        If sign = 0 then FRT <- 0x7FFF_FFFF_FFFF_FFFF
        If sign = 1 then FRT <- 0x8000_0000_0000_0000
      End
    FPSCR[FPRF] <- undefined
  End
  Done

SNaN Operand:
  FPSCR[FR FI VXSNAN VXCVI] <- 0b0011
  If FPSCR[VE] = 0 then
    Do /* u is undefined hex digit */
      If tgt_precision = "32-bit integer" then FRT <- 0xuuuu_uuuu_8000_0000
      If tgt_precision = "64-bit integer" then FRT <- 0x8000_0000_0000_0000
      FPSCR[FPRF] <- undefined
    End
  Done

QNaN Operand:
  FPSCR[FR FI VXCVI] <- 0b001
  If FPSCR[VE] = 0 then
    Do /* u is undefined hex digit */
      If tgt_precision = "32-bit integer" then FRT <- 0xuuuu_uuuu_8000_0000
      If tgt_precision = "64-bit integer" then FRT <- 0x8000_0000_0000_0000
      FPSCR[FPRF] <- undefined
    End
  Done

Large Operand:
  FPSCR[FR FI VXCVI] <- 0b001
  If FPSCR[VE] = 0 then Do
    If tgt_precision = "32-bit integer" then
      Do /* u is undefined hex digit */
        If sign = 0 then FRT <- 0xuuuu_uuuu_7FFF_FFFF
        If sign = 1 then FRT <- 0xuuuu_uuuu_8000_0000
      End
    Else
      Do
        If sign = 0 then FRT <- 0x7FFF_FFFF_FFFF_FFFF
        If sign = 1 then FRT <- 0x8000_0000_0000_0000
      End
    FPSCR[FPRF] <- undefined
  End
  Done
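
The observable behavior of the convert-to-integer model (rounding by mode, then saturation for NaN, infinity, and out-of-range operands) can be sketched in Python as below. This is a behavioral approximation under stated assumptions, not a bit-level transcription: the name `fcti` is hypothetical, the result is returned as a signed Python integer, exact rational arithmetic stands in for the guard/round/sticky bits, and the second return value models only the FPSCR[VXCVI] flag with invalid-operation exceptions disabled.

```python
import math
from fractions import Fraction

def fcti(value, bits=32, round_mode=0b00):
    """Return (integer result, vxcvi flag) for a Python float operand."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isnan(value):              # SNaN and QNaN both give the minimum
        return lo, True
    if math.isinf(value):              # Infinity Operand
        return (hi if value > 0 else lo), True
    exact = Fraction(value)            # exact value of the double
    if round_mode == 0b00:             # round to nearest, ties to even
        result = round(exact)
    elif round_mode == 0b01:           # round toward zero
        result = math.trunc(exact)
    elif round_mode == 0b10:           # round toward +infinity
        result = math.ceil(exact)
    else:                              # round toward -infinity
        result = math.floor(exact)
    if result > hi:                    # Large Operand, positive
        return hi, True
    if result < lo:                    # Large Operand, negative
        return lo, True
    return result, False
```

Note that the range check is applied after rounding, as in the model, so an operand such as 2147483647.5 saturates in round-to-nearest mode.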



B.3 Floating-Point Convert from Integer Model

The following describes algorithmically the operation of the Floating 
Convert From Integer Doubleword instruction. 

sign <- (FRB)[0]
exp <- 63
frac[0:63] <- (FRB)

If frac[0:63] = 0 then go to Zero Operand
If sign = 1 then frac[0:63] <- ¬frac[0:63] + 1
Do while frac[0] = 0
  /* do the loop 0 times if (FRB) = maximum negative integer */
  frac[0:63] <- frac[1:63] || 0b0
  exp <- exp - 1
End

Round Float(sign,exp,frac[0:63],FPSCR[RN])
If sign = 0 then FPSCR[FPRF] <- "+normal number"
If sign = 1 then FPSCR[FPRF] <- "-normal number"
FRT[0] <- sign
FRT[1:11] <- exp + 1023 /* exp + bias */
FRT[12:63] <- frac[1:52]
Done

Zero Operand:
  FPSCR[FR FI] <- 0b00
  FPSCR[FPRF] <- "+zero"
  FRT <- 0x0000_0000_0000_0000
  Done

Round Float(sign,exp,frac[0:63],round_mode):
  /* comparisons ignore u bits */
  inc <- 0
  lsb <- frac[52]
  gbit <- frac[53]
  rbit <- frac[54]
  xbit <- frac[55:63] > 0
  If round_mode = 0b00 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0bu011u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc <- 1
    End
  If round_mode = 0b10 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc <- 1
    End
  If round_mode = 0b11 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc <- 1
      If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc <- 1
    End
  frac[0:52] <- frac[0:52] + inc
  If carry_out = 1 then exp <- exp + 1
  FPSCR[FR] <- inc
  FPSCR[FI] <- gbit | rbit | xbit
  FPSCR[XX] <- FPSCR[XX] | FPSCR[FI]
  Return
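
The convert-from-integer model, including Round Float, can be exercised with a short Python sketch. The function name `fcfid_model` is hypothetical, the operand is taken as a signed Python integer assumed to lie in the 64-bit range, and the result is returned as a Python float rather than as a bit pattern in FRT; FPSCR updates are omitted.

```python
import math

MASK64 = (1 << 64) - 1

def fcfid_model(n, round_mode=0b00):
    """Convert a signed 64-bit integer to a double, following the model."""
    if n == 0:
        return 0.0                        # Zero Operand
    sign = int(n < 0)
    frac = abs(n)                         # magnitude, held as frac[0:63]
    exp = 63
    while not (frac >> 63) & 1:           # normalize: shift until frac[0] = 1
        frac = (frac << 1) & MASK64
        exp -= 1
    lsb = (frac >> 11) & 1                # frac[52]
    gbit = (frac >> 10) & 1               # frac[53]
    rbit = (frac >> 9) & 1                # frac[54]
    xbit = int((frac & 0x1FF) != 0)       # frac[55:63] > 0
    inexact = gbit | rbit | xbit
    if round_mode == 0b00:                # nearest, ties to even
        inc = gbit & (lsb | rbit | xbit)
    elif round_mode == 0b10:              # toward +infinity
        inc = int(sign == 0 and inexact)
    elif round_mode == 0b11:              # toward -infinity
        inc = int(sign == 1 and inexact)
    else:                                 # toward zero
        inc = 0
    top = (frac >> 11) + inc              # frac[0:52] + inc, at most 2**53
    # The least significant bit of frac[0:52] has weight 2**(exp - 52). A carry
    # out (top == 2**53) is absorbed here by ldexp, which has the same effect
    # as the model's "exp <- exp + 1" with a zero fraction.
    value = math.ldexp(float(top), exp - 52)
    return -value if sign else value
```

In round-to-nearest mode the sketch should agree with Python's own `float(n)` conversion for any 64-bit signed `n`.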
Assembler Extended Mnemonics




In order to make assembler language programs simpler to write and eas- 
ier to understand, a set of extended mnemonics and symbols is provided 
that defines simple shorthand for the most frequently used forms of 
Branch Conditional, Compare, Trap, Rotate and Shift, and certain other 
instructions. 

Assemblers should provide the mnemonics and symbols listed here, 
and may provide others. 

C.1 Symbols

The following symbols are defined for use in instructions (basic or
extended mnemonics) that specify a Condition Register field or a
Condition Register bit. The first five (lt, ..., un) identify a bit number
within a CR field. The remainder (cr0, ..., cr7) identify a CR field. An
expression in which a CR field symbol is multiplied by 4 and then added to
a bit-number-within-CR-field symbol can be used to identify a CR bit.

Symbol   Value   Meaning
lt       0       Less than
gt       1       Greater than
eq       2       Equal
so       3       Summary overflow
un       3       Unordered (after floating-point comparison)
cr0      0       CR Field 0
cr1      1       CR Field 1
cr2      2       CR Field 2
cr3      3       CR Field 3
cr4      4       CR Field 4
cr5      5       CR Field 5
cr6      6       CR Field 6
cr7      7       CR Field 7


The extended mnemonics in Sections C.2.2 and C.3 require identification of
a CR bit: if one of the CR field symbols is used, it must be multiplied by
4 and added to a bit-number-within-CR-field (value in the range 0-3,
explicit or symbolic). The extended mnemonics in Sections C.2.3 and C.5
require identification of a CR field: if one of the CR field symbols is
used, it must not be multiplied by 4. (For the extended mnemonics in
Section C.2.3, the bit number within the CR field is part of the extended
mnemonic. The programmer identifies the CR field, and the Assembler does
the multiplication and addition required to produce a CR bit number for
the BI field of the underlying basic mnemonic.)
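
The CR bit arithmetic described above is small enough to show directly. The following Python sketch uses the symbol values from the table in this section; the helper name `cr_bit` is hypothetical and is not an assembler feature.

```python
# Symbol values as defined in Section C.1.
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}
CR_FIELD = {f"cr{i}": i for i in range(8)}

def cr_bit(field, bit):
    """Return the CR bit number for an expression such as 4*cr5+eq."""
    return 4 * CR_FIELD[field] + CR_BIT[bit]
```

For example, `cr_bit("cr5", "eq")` evaluates the expression `4*cr5+eq` and gives 22, which is the BI value an assembler would place in the underlying Branch Conditional instruction.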

C.2 Branch Mnemonics

The mnemonics discussed in this section are variations of the Branch
Conditional instructions.

C.2.1 BO and BI Fields

The 5-bit BO field in Branch Conditional instructions encodes the
following operations:

■ Decrement CTR 

■ Test CTR equal to 0 

■ Test CTR not equal to 0 

■ Test condition true 

■ Test condition false 

■ Branch prediction (taken, fall through) 

The 5-bit BI field in Branch Conditional instructions specifies which of 
the 32 bits in the CR represents the condition to test. 

To provide an extended mnemonic for every possible combination of BO and
BI fields would require 2^10 = 1024 mnemonics. Most of these would be only
marginally useful. The following abbreviated set is intended to cover the
most useful cases. Unusual cases can be coded using a basic Branch
Conditional mnemonic (bc, bclr, bcctr) with the condition to be tested
specified as a numeric operand.

C.2.2 Simple Branch Mnemonics 

The mnemonics in Table 2 on page 218 allow all the useful BO encodings 
to be specified, along with the AA (absolute address) and LK (set Link 
Register) fields. 

Notice that there are no extended mnemonics for relative and absolute
unconditional branches. For these, the basic mnemonics b, ba, bl, and bla
should be used.

Instructions using one of the mnemonics in Table 2 that tests a condition
specify the corresponding Condition Register bit as the first operand. The
symbols defined in Section C.1 can be used in this operand.

                                                          LR not set
Branch semantics                                          bc        bca       bclr      bcctr
                                                          Relative  Absolute  To LR     To CTR
Branch unconditionally                                    -         -         blr       bctr
Branch if condition true                                  bt        bta       btlr      btctr
Branch if condition false                                 bf        bfa       bflr      bfctr
Decrement CTR, branch if CTR nonzero                      bdnz      bdnza     bdnzlr    -
Decrement CTR, branch if CTR nonzero AND condition true   bdnzt     bdnzta    bdnztlr   -
Decrement CTR, branch if CTR nonzero AND condition false  bdnzf     bdnzfa    bdnzflr   -
Decrement CTR, branch if CTR zero                         bdz       bdza      bdzlr     -
Decrement CTR, branch if CTR zero AND condition true      bdzt      bdzta     bdztlr    -
Decrement CTR, branch if CTR zero AND condition false     bdzf      bdzfa     bdzflr    -

                                                          LR set
Branch semantics                                          bcl       bcla      bclrl     bcctrl
                                                          Relative  Absolute  To LR     To CTR
Branch unconditionally                                    -         -         blrl      bctrl
Branch if condition true                                  btl       btla      btlrl     btctrl
Branch if condition false                                 bfl       bfla      bflrl     bfctrl
Decrement CTR, branch if CTR nonzero                      bdnzl     bdnzla    bdnzlrl   -
Decrement CTR, branch if CTR nonzero AND condition true   bdnztl    bdnztla   bdnztlrl  -
Decrement CTR, branch if CTR nonzero AND condition false  bdnzfl    bdnzfla   bdnzflrl  -
Decrement CTR, branch if CTR zero                         bdzl      bdzla     bdzlrl    -
Decrement CTR, branch if CTR zero AND condition true      bdztl     bdztla    bdztlrl   -
Decrement CTR, branch if CTR zero AND condition false     bdzfl     bdzfla    bdzflrl   -

Table 2. Simple branch mnemonics

Examples

1. Decrement CTR and branch if it is still nonzero (closure of a loop
   controlled by a count loaded into CTR).

   bdnz target (equivalent to: bc 16,0,target)




2. Same as (1) but branch only if CTR is nonzero and condition in CR0 is
   "equal."

   bdnzt eq,target (equivalent to: bc 8,2,target)

3. Same as (2), but "equal" condition is in CR5.

   bdnzt 4*cr5+eq,target (equivalent to: bc 8,22,target)

4. Branch if bit 27 of CR is false.

   bf 27,target (equivalent to: bc 4,27,target)

5. Same as (4), but set the Link Register. This is a form of conditional
   "call."

   bfl 27,target (equivalent to: bcl 4,27,target)

C.2.3 Branch Mnemonics Incorporating Conditions

The mnemonics defined in Table 3 on page 220 are variations of the 
"branch if condition true" and "branch if condition false" BO encodings, 
with the most useful values of BI represented in the mnemonic rather 
than specified as a numeric operand. 

A standard set of codes has been adopted for the most common com- 
binations of branch conditions. 



Code   Meaning
lt     Less than
le     Less than or equal
eq     Equal
ge     Greater than or equal
gt     Greater than
nl     Not less than
ne     Not equal
ng     Not greater than
so     Summary overflow
ns     Not summary overflow
un     Unordered (after floating-point comparison)
nu     Not unordered (after floating-point comparison)


These codes are reflected in the mnemonics shown in Table 3 on 
page 220. 

Instructions using the mnemonics in Table 3 specify the Condition Register
field in an optional first operand. If the CR field being tested is CR0,
this operand need not be specified. One of the CR field symbols defined in
Section C.1 can be used for this operand.

                                 LR not set
Branch semantics                 bc        bca       bclr      bcctr
                                 Relative  Absolute  To LR     To CTR
Branch if less than              blt       blta      bltlr     bltctr
Branch if less than or equal     ble       blea      blelr     blectr
Branch if equal                  beq       beqa      beqlr     beqctr
Branch if greater than or equal  bge       bgea      bgelr     bgectr
Branch if greater than           bgt       bgta      bgtlr     bgtctr
Branch if not less than          bnl       bnla      bnllr     bnlctr
Branch if not equal              bne       bnea      bnelr     bnectr
Branch if not greater than       bng       bnga      bnglr     bngctr
Branch if summary overflow       bso       bsoa      bsolr     bsoctr
Branch if not summary overflow   bns       bnsa      bnslr     bnsctr
Branch if unordered              bun       buna      bunlr     bunctr
Branch if not unordered          bnu       bnua      bnulr     bnuctr

                                 LR set
Branch semantics                 bcl       bcla      bclrl     bcctrl
                                 Relative  Absolute  To LR     To CTR
Branch if less than              bltl      bltla     bltlrl    bltctrl
Branch if less than or equal     blel      blela     blelrl    blectrl
Branch if equal                  beql      beqla     beqlrl    beqctrl
Branch if greater than or equal  bgel      bgela     bgelrl    bgectrl
Branch if greater than           bgtl      bgtla     bgtlrl    bgtctrl
Branch if not less than          bnll      bnlla     bnllrl    bnlctrl
Branch if not equal              bnel      bnela     bnelrl    bnectrl
Branch if not greater than       bngl      bngla     bnglrl    bngctrl
Branch if summary overflow       bsol      bsola     bsolrl    bsoctrl
Branch if not summary overflow   bnsl      bnsla     bnslrl    bnsctrl
Branch if unordered              bunl      bunla     bunlrl    bunctrl
Branch if not unordered          bnul      bnula     bnulrl    bnuctrl

Table 3. Branch mnemonics incorporating conditions

Examples

1. Branch if CR0 reflects condition "not equal."

   bne target (equivalent to: bc 4,2,target)



2. Same as (1), but condition is in CR3.

   bne cr3,target (equivalent to: bc 4,14,target)

3. Branch to an absolute target if CR4 specifies "greater than," setting
   the Link Register. This is a form of conditional "call."

   bgtla cr4,target (equivalent to: bcla 12,17,target)

4. Same as (3), but target address is in the Count Register.

   bgtctrl cr4 (equivalent to: bcctrl 12,17)
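
The expansion that an assembler performs for these mnemonics can be sketched in Python. This is an illustration, not an assembler implementation: the function name `expand_branch` is hypothetical, and the BO values 12 ("branch if condition true") and 4 ("branch if condition false") are taken from the examples above. Each two-letter code maps to a CR bit within the field and a true/false sense.

```python
# code -> (bit within CR field, branch if that bit is true?)
CONDITION = {
    "lt": (0, True),  "ge": (0, False), "nl": (0, False),
    "gt": (1, True),  "le": (1, False), "ng": (1, False),
    "eq": (2, True),  "ne": (2, False),
    "so": (3, True),  "ns": (3, False),
    "un": (3, True),  "nu": (3, False),
}

def expand_branch(mnemonic, cr_field="cr0"):
    """Map e.g. ('bne', 'cr3') to the basic form ('bc', 4, 14)."""
    code, suffix = mnemonic[1:3], mnemonic[3:]   # 'b' + code + suffix
    bit, sense = CONDITION[code]
    bo = 12 if sense else 4
    bi = 4 * int(cr_field[2:]) + bit             # assembler does 4*crN + bit
    return "bc" + suffix, bo, bi
```

For instance, `expand_branch("bgtla", "cr4")` yields `("bcla", 12, 17)`, matching example 3.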

C.2.4 Branch Prediction 

In Branch Conditional instructions that are not always taken, the
low-order bit ("y" bit) of the BO field provides a hint about whether the
branch is likely to be taken: see the discussion of the "y" bit in Section
2.4.1, "Branch Instructions," on page 35.

Assemblers should set this bit to 0 unless otherwise directed. This 
default action means that: 

■ A Branch Conditional with a negative displacement field is predicted 
to be taken. 

■ A Branch Conditional with a nonnegative displacement field is pre- 
dicted not to be taken (fall through). 

■ A Branch Conditional to an address in the LR or CTR is predicted not 
to be taken (fall through). 

If the likely outcome (branch or fall through) of a given Branch Condi- 
tional instruction is known, a suffix can be added to the mnemonic that 
tells the assembler how to set the "y" bit. 

+   Predict branch to be taken.

-   Predict branch not to be taken.

Such a suffix can be added to any Branch Conditional mnemonic, 
either basic or extended. 

For relative and absolute branches (bc[l][a]), the setting of the "y" bit
depends on whether the displacement field is negative or nonnegative. For
negative displacement fields, coding the suffix "+" causes the bit to be
set to 0, and coding the suffix "-" causes the bit to be set to 1. For
nonnegative displacement fields, coding the suffix "+" causes the bit to
be set to 1, and coding the suffix "-" causes the bit to be set to 0.

For branches to an address in the LR or CTR (bclr[l] or bcctr[l]), cod- 
ing the suffix "+" causes the "y" bit to be set to 1, and coding the suffix 
"-" causes the bit to be set to 0. 

Examples 

1. Branch if CR0 reflects condition "less than," specifying that the
   branch should be predicted to be taken.

   blt+ target

2. Same as (1), but target address is in the Link Register and the branch
   should be predicted not to be taken.

   bltlr-
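
The rules for setting the "y" bit from a prediction suffix can be collected into one small function. The Python sketch below is illustrative only; the name `y_bit` and its parameters are hypothetical. With no suffix the bit is 0, which gives the default predictions listed earlier in this section.

```python
def y_bit(suffix, to_register=False, negative_displacement=False):
    """Return the "y" bit for a Branch Conditional mnemonic.

    suffix: "+", "-", or "" (no suffix).
    to_register: True for branches to the LR or CTR (bclr[l], bcctr[l]).
    negative_displacement: sign of the displacement for bc[l][a].
    """
    if suffix == "":
        return 0                            # assembler default
    if not to_register and negative_displacement:
        # Default prediction is "taken", so "+" keeps y = 0.
        return 0 if suffix == "+" else 1
    # Default prediction is "not taken", so "+" must set y = 1.
    return 1 if suffix == "+" else 0
```

So `blt+ target` with a backward (negative) displacement leaves the bit at 0, while `bltlr-` also yields 0 because branches through the LR are predicted not taken by default.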
C.3 Condition Register Logical Mnemonics

The Condition Register Logical instructions can be used to set (to 1), 
clear (to 0), copy, or invert a given Condition Register bit. Extended mne- 
monics are provided in Table 4 that allow these operations to be coded 
easily. 



Operation                  Extended mnemonic   Equivalent to
Condition Register set     crset bx            creqv bx,bx,bx
Condition Register clear   crclr bx            crxor bx,bx,bx
Condition Register move    crmove bx,by        cror bx,by,by
Condition Register not     crnot bx,by         crnor bx,by,by

Table 4. Condition Register logical mnemonics

The symbols defined in Section C.1 can be used to identify the Condition
Register bits.
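
The equivalences in Table 4 follow from elementary Boolean identities (x eqv x = 1, x xor x = 0, y or y = y, y nor y = not y). The Python sketch below models the Condition Register as a list of 32 bits to show this; the function names mirror the mnemonics, but the model itself is an illustration rather than a description of any assembler.

```python
def creqv(cr, bt, ba, bb):
    cr[bt] = 1 - (cr[ba] ^ cr[bb])    # equivalence (XNOR)

def crxor(cr, bt, ba, bb):
    cr[bt] = cr[ba] ^ cr[bb]

def cror(cr, bt, ba, bb):
    cr[bt] = cr[ba] | cr[bb]

def crnor(cr, bt, ba, bb):
    cr[bt] = 1 - (cr[ba] | cr[bb])

# Extended mnemonics, expanded as in Table 4.
def crset(cr, bx):
    creqv(cr, bx, bx, bx)

def crclr(cr, bx):
    crxor(cr, bx, bx, bx)

def crmove(cr, bx, by):
    cror(cr, bx, by, by)

def crnot(cr, bx, by):
    crnor(cr, bx, by, by)
```

Calling `crset(cr, 25)` on any list `cr` of 32 bits leaves `cr[25] == 1` regardless of its previous value, which is why `creqv 25,25,25` serves as a "set" operation.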

Examples 

1. Set CR bit 25. 

crset 25 (equivalent to: creqv 25,25,25) 

2. Clear the SO bit of CR0.

   crclr so (equivalent to: crxor 3,3,3)

3. Same as (2), but SO bit to be cleared is in CR3.

   crclr 4*cr3+so (equivalent to: crxor 15,15,15)

4. Invert the EQ bit.

   crnot eq,eq (equivalent to: crnor 2,2,2)

5. Same as (4), but EQ bit to be inverted is in CR4, and the result is to
   be placed into the EQ bit of CR5.

   crnot 4*cr5+eq,4*cr4+eq (equivalent to: crnor 22,18,18)

C.4 Subtract Mnemonics

C.4.1 Subtract Immediate

Although there is no "Subtract Immediate" instruction, its effect can be
achieved by using an Add Immediate instruction with the immediate operand
negated. Extended mnemonics are provided that include this negation,
making the intent of the computation clearer.

   subi Rx,Ry,value      (equivalent to: addi Rx,Ry,-value)
   subis Rx,Ry,value     (equivalent to: addis Rx,Ry,-value)
   subic Rx,Ry,value     (equivalent to: addic Rx,Ry,-value)
   subic. Rx,Ry,value    (equivalent to: addic. Rx,Ry,-value)



C.4.2 Subtract 

The Subtract From instructions subtract the second operand (RA) from 
the third (RB). Extended mnemonics are provided that use the more 
"normal" order, in which the third operand is subtracted from the sec- 
ond. Both these mnemonics can be coded with a final "o" and/or "." to 
cause the OE and/or Rc bit to be set in the underlying instruction. 

sub Rx,Ry,Rz (equivalent to: subf Rx,Rz,Ry) 

subc Rx,Ry,Rz (equivalent to: subfc Rx,Rz,Ry) 
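
The operand swap and the immediate negation can be checked with a small Python model. This is a sketch under simplifying assumptions: register contents are modeled as unsigned 64-bit Python integers, carry and overflow bits are ignored, the special treatment of register 0 as an operand of addi is not modeled, and all function names are chosen here only to mirror the mnemonics.

```python
MASK64 = (1 << 64) - 1

def subf(ra, rb):
    """Subtract From: the result is RB - RA (modulo 2**64)."""
    return (rb - ra) & MASK64

def sub(ry, rz):
    """sub Rx,Ry,Rz is coded as subf Rx,Rz,Ry, so the result is Ry - Rz."""
    return subf(rz, ry)

def addi(ra, si):
    """Add Immediate with a signed 16-bit immediate."""
    assert -0x8000 <= si <= 0x7FFF, "immediate must fit in 16 signed bits"
    return (ra + si) & MASK64

def subi(ry, value):
    """subi Rx,Ry,value is coded as addi Rx,Ry,-value."""
    return addi(ry, -value)
```

Both `sub(10, 3)` and `subi(10, 3)` give 7, while `subf(10, 3)` gives the two's complement representation of -7, which illustrates why the extended forms read more naturally.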



C.5 Compare Mnemonics 

The L field in the fixed-point Compare instructions controls whether the
operands are treated as 64-bit quantities (L=1) or as 32-bit quantities
(L=0). Extended mnemonics are provided that represent the L value in the
mnemonic rather than requiring it to be coded as a numeric operand.

The BF field can be omitted if the result of the comparison is to be
placed in CR Field 0. Otherwise the target CR field must be specified as
the first operand. One of the CR field symbols defined in Section C.1 can
be used for this operand.

Note: The basic Compare mnemonics of PowerPC are the same as those of
Power, but the Power instructions have three operands while the PowerPC
instructions have four. The assembler will recognize a basic Compare
mnemonic with three operands as the Power form and will generate the
instruction with L=0. (Thus the assembler must require that the BF field,
which normally can be omitted when CR Field 0 is the target, be specified
explicitly if L is.)

C.5.1 Doubleword Comparisons

These operations are available only in 64-bit implementations. 



Operation                               Extended mnemonic   Equivalent to
Compare doubleword immediate            cmpdi bf,ra,si      cmpi bf,1,ra,si
Compare doubleword                      cmpd bf,ra,rb       cmp bf,1,ra,rb
Compare logical doubleword immediate    cmpldi bf,ra,ui     cmpli bf,1,ra,ui
Compare logical doubleword              cmpld bf,ra,rb      cmpl bf,1,ra,rb

Table 5. Doubleword compare mnemonics



Examples

1. Compare register Rx and immediate value 100 as unsigned 64-bit integers
   and place result in CR0.

   cmpldi Rx,100 (equivalent to: cmpli 0,1,Rx,100)

2. Same as (1), but place result in CR4.

   cmpldi cr4,Rx,100 (equivalent to: cmpli 4,1,Rx,100)

3. Compare registers Rx and Ry as signed 64-bit integers and place result
   in CR0.

   cmpd Rx,Ry (equivalent to: cmp 0,1,Rx,Ry)

C.5.2 Word Comparisons

These operations are available in all implementations. 



Operation                         Extended mnemonic   Equivalent to
Compare word immediate            cmpwi bf,ra,si      cmpi bf,0,ra,si
Compare word                      cmpw bf,ra,rb       cmp bf,0,ra,rb
Compare logical word immediate    cmplwi bf,ra,ui     cmpli bf,0,ra,ui
Compare logical word              cmplw bf,ra,rb      cmpl bf,0,ra,rb

Table 6. Word compare mnemonics

Examples

1. Compare bits 32:63 of register Rx and immediate value 100 as signed
   32-bit integers and place result in CR0.

   cmpwi Rx,100 (equivalent to: cmpi 0,0,Rx,100)

2. Same as (1), but place result in CR4.

   cmpwi cr4,Rx,100 (equivalent to: cmpi 4,0,Rx,100)

3. Compare bits 32:63 of registers Rx and Ry as unsigned 32-bit integers
   and place result in CR0.

   cmplw Rx,Ry (equivalent to: cmpl 0,0,Rx,Ry)
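
The mapping in Tables 5 and 6, including the optional BF operand, can be expressed as a small text expansion. The Python sketch below is illustrative; `expand_compare` is a hypothetical helper, and operands are passed as plain strings or integers rather than parsed from assembler source.

```python
# extended mnemonic -> (basic mnemonic, L value), per Tables 5 and 6
COMPARE = {
    "cmpdi": ("cmpi", 1), "cmpd": ("cmp", 1),
    "cmpldi": ("cmpli", 1), "cmpld": ("cmpl", 1),
    "cmpwi": ("cmpi", 0), "cmpw": ("cmp", 0),
    "cmplwi": ("cmpli", 0), "cmplw": ("cmpl", 0),
}

def expand_compare(mnemonic, *operands):
    """Return the basic four-operand form as a string."""
    basic, l_value = COMPARE[mnemonic]
    if len(operands) == 2:          # BF omitted: result goes to CR Field 0
        bf, (a, b) = 0, operands
    else:
        bf, a, b = operands
    if isinstance(bf, str) and bf.startswith("cr"):
        bf = int(bf[2:])            # CR field symbol, not multiplied by 4
    return f"{basic} {bf},{l_value},{a},{b}"
```

For example, `expand_compare("cmpwi", "cr4", "Rx", 100)` returns `"cmpi 4,0,Rx,100"`, as in example 2 above.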

C.6 Trap Mnemonics 

The mnemonics defined in Table 7 on page 227 are variations of the Trap 
instructions, with the most useful values of TO represented in the mne- 
monic rather than specified as a numeric operand. 

A standard set of codes has been adopted for the most common combinations
of trap conditions.

Code     Meaning                           TO encoding   <   >   =   <U  >U
lt       Less than                         16            1   0   0   0   0
le       Less than or equal                20            1   0   1   0   0
eq       Equal                              4            0   0   1   0   0
ge       Greater than or equal             12            0   1   1   0   0
gt       Greater than                       8            0   1   0   0   0
nl       Not less than                     12            0   1   1   0   0
ne       Not equal                         24            1   1   0   0   0
ng       Not greater than                  20            1   0   1   0   0
llt      Logically less than                2            0   0   0   1   0
lle      Logically less than or equal       6            0   0   1   1   0
lge      Logically greater than or equal    5            0   0   1   0   1
lgt      Logically greater than             1            0   0   0   0   1
lnl      Logically not less than            5            0   0   1   0   1
lng      Logically not greater than         6            0   0   1   1   0
(none)   Unconditional                     31            1   1   1   1   1

These codes are reflected in the mnemonics shown in Table 7.

Examples

1. Trap if register Rx is not 0.

   tdnei Rx,0 (equivalent to: tdi 24,Rx,0)

2. Same as (1), but comparison is to register Ry.

   tdne Rx,Ry (equivalent to: td 24,Rx,Ry)

3. Trap if bits 32:63 of register Rx, considered as a 32-bit quantity, are
   logically greater than 0x7FF.

   twlgti Rx,0x7FF (equivalent to: twi 1,Rx,0x7FF)

4. Trap unconditionally.

   trap (equivalent to: tw 31,0,0)

                                         64-bit comparison     32-bit comparison
Trap semantics                           tdi        td         twi        tw
                                         Immediate  Register   Immediate  Register
Trap unconditionally                     -          -          -          trap
Trap if less than                        tdlti      tdlt       twlti      twlt
Trap if less than or equal               tdlei      tdle       twlei      twle
Trap if equal                            tdeqi      tdeq       tweqi      tweq
Trap if greater than or equal            tdgei      tdge       twgei      twge
Trap if greater than                     tdgti      tdgt       twgti      twgt
Trap if not less than                    tdnli      tdnl       twnli      twnl
Trap if not equal                        tdnei      tdne       twnei      twne
Trap if not greater than                 tdngi      tdng       twngi      twng
Trap if logically less than              tdllti     tdllt      twllti     twllt
Trap if logically less than or equal     tdllei     tdlle      twllei     twlle
Trap if logically greater than or equal  tdlgei     tdlge      twlgei     twlge
Trap if logically greater than           tdlgti     tdlgt      twlgti     twlgt
Trap if logically not less than          tdlnli     tdlnl      twlnli     twlnl
Trap if logically not greater than       tdlngi     tdlng      twlngi     twlng

Table 7. Trap mnemonics
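
The TO encodings listed for the trap codes can be read as a 5-bit mask in which each bit enables one comparison outcome: signed less than (16), signed greater than (8), equal (4), logically less than (2), and logically greater than (1). The Python sketch below evaluates such a mask; the function name `trap_taken` is hypothetical, and operands are given as Python integers that are reduced to the chosen width.

```python
def trap_taken(to, a, b, bits=64):
    """Return True if a trap with TO mask `to` fires when comparing a with b."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask                    # logical (unsigned) views

    def signed(v):
        return v - (1 << bits) if v >> (bits - 1) else v

    sa, sb = signed(ua), signed(ub)                # signed views
    return bool(
        (to & 16 and sa < sb)
        or (to & 8 and sa > sb)
        or (to & 4 and sa == sb)
        or (to & 2 and ua < ub)
        or (to & 1 and ua > ub)
    )
```

With `to=24` (code ne) the trap fires whenever the operands differ, and with `to=31` it fires for every pair of operands, which is how the unconditional `trap` mnemonic is obtained.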



C.7 Rotate and Shift Mnemonics

The Rotate and Shift instructions provide powerful and general ways to
manipulate register contents, but can be difficult to understand. Extended
mnemonics are provided that allow some of the simpler operations to be
coded easily.

Mnemonics are provided for the following types of operation:

Extract   Select a field of n bits starting at bit position b in the
          source register; right or left justify this field in the target
          register; clear all other bits of the target register to 0.

Insert    Select a left-justified or right-justified field of n bits in
          the source register; insert this field starting at bit position
          b of the target register; leave other bits of the target
          register unchanged. (No extended mnemonic is provided for
          insertion of a left-justified field when operating on
          doublewords, because such an insertion requires more than one
          instruction.)

Rotate    Rotate the contents of a register right or left n bits without
          masking.

Shift     Shift the contents of a register right or left n bits, clearing
          vacated bits to 0 (logical shift).

Clear     Clear the leftmost or rightmost n bits of a register to 0.

Clear left and shift left
          Clear the leftmost b bits of a register, then shift the register
          left by n bits. This operation can be used to scale a (known
          nonnegative) array index by the width of an element.

C.7.1 Operations on Doublewords 

These operations are available only in 64-bit implementations. All these 
mnemonics can be coded with a final "." to cause the Rc bit to be set in 
the underlying instruction. 

Operation                             Extended mnemonic                  Equivalent to
Extract and left justify immediate    extldi ra,rs,n,b (n > 0)           rldicr ra,rs,b,n-1
Extract and right justify immediate   extrdi ra,rs,n,b (n > 0)           rldicl ra,rs,b+n,64-n
Insert from right immediate           insrdi ra,rs,n,b (n > 0)           rldimi ra,rs,64-(b+n),b
Rotate left immediate                 rotldi ra,rs,n                     rldicl ra,rs,n,0
Rotate right immediate                rotrdi ra,rs,n                     rldicl ra,rs,64-n,0
Rotate left                           rotld ra,rs,rb                     rldcl ra,rs,rb,0
Shift left immediate                  sldi ra,rs,n (n < 64)              rldicr ra,rs,n,63-n
Shift right immediate                 srdi ra,rs,n (n < 64)              rldicl ra,rs,64-n,n
Clear left immediate                  clrldi ra,rs,n (n < 64)            rldicl ra,rs,0,n
Clear right immediate                 clrrdi ra,rs,n (n < 64)            rldicr ra,rs,0,63-n
Clear left and shift left immediate   clrlsldi ra,rs,b,n (n <= b < 64)   rldic ra,rs,n,b-n

Table 8. Doubleword rotate and shift mnemonics

Examples

1. Extract the sign bit (bit 0) of register Ry and place the result
   right-justified into register Rx.

   extrdi Rx,Ry,1,0 (equivalent to: rldicl Rx,Ry,1,63)

2. Insert the bit extracted in (1) into the sign bit (bit 0) of register
   Rz.

   insrdi Rz,Rx,1,0 (equivalent to: rldimi Rz,Rx,63,0)

3. Shift the contents of register Rx left 8 bits.

   sldi Rx,Rx,8 (equivalent to: rldicr Rx,Rx,8,55)

4. Clear the high-order 32 bits of register Ry and place the result into
   register Rx.

   clrldi Rx,Ry,32 (equivalent to: rldicl Rx,Ry,0,32)
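
The doubleword equivalences in Table 8 can be verified with a short Python model of the underlying rotate-and-mask instructions. The sketch assumes the usual definitions (rotate left by the shift amount, then AND with a mask of 1-bits from bit mb through bit me, with bit 0 as the most significant bit); register values are unsigned 64-bit Python integers, masks that wrap around (mb > me) are not handled, and the helper names are chosen here to mirror the mnemonics.

```python
M64 = (1 << 64) - 1

def rotl64(x, n):
    n %= 64
    return ((x << n) | (x >> (64 - n))) & M64

def mask(mb, me):
    """1-bits from bit mb through bit me (bit 0 = MSB); assumes mb <= me."""
    return ((1 << (64 - mb)) - 1) & ~((1 << (63 - me)) - 1)

def rldicl(rs, sh, mb):
    return rotl64(rs, sh) & mask(mb, 63)

def rldicr(rs, sh, me):
    return rotl64(rs, sh) & mask(0, me)

def rldic(rs, sh, mb):
    return rotl64(rs, sh) & mask(mb, 63 - sh)

def rldimi(ra, rs, sh, mb):
    m = mask(mb, 63 - sh)
    return (rotl64(rs, sh) & m) | (ra & ~m & M64)

# Extended mnemonics, expanded as in Table 8.
def sldi(rs, n):
    return rldicr(rs, n, 63 - n)

def srdi(rs, n):
    return rldicl(rs, 64 - n, n)

def extrdi(rs, n, b):
    return rldicl(rs, b + n, 64 - n)

def clrldi(rs, n):
    return rldicl(rs, 0, n)

def insrdi(ra, rs, n, b):
    return rldimi(ra, rs, 64 - (b + n), b)

def clrlsldi(rs, b, n):
    return rldic(rs, n, b - n)
```

As a usage example, `clrlsldi(index, 32, 2)` clears the upper 32 bits of `index` and multiplies the remainder by 4, the array-index scaling mentioned in the description of "Clear left and shift left."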



C.7.2 Operations on Words 

These operations are available in all implementations. All these
mnemonics can be coded with a final "." to cause the Rc bit to be set in
the underlying instruction. The operations as described above apply to the
low-order 32 bits of the registers, as if the registers were 32-bit
registers. The Insert operations either preserve the high-order 32 bits of
the target register or place rotated data there; the other operations
clear these bits.

Examples 

1. Extract the sign bit (bit 32) of register Ry and place the result right- 
justified into register Rx. 

extrwi Rx,Ry,1,0 (equivalent to: rlwinm Rx,Ry,1,31,31)

2. Insert the bit extracted in (1) into the sign bit (bit 32) of register Rz. 
insrwi Rz,Rx,1,0 (equivalent to: rlwimi Rz,Rx,31,0,0)

Operation                             Extended mnemonic                  Equivalent to
Extract and left justify immediate    extlwi ra,rs,n,b (n > 0)           rlwinm ra,rs,b,0,n-1
Extract and right justify immediate   extrwi ra,rs,n,b (n > 0)           rlwinm ra,rs,b+n,32-n,31
Insert from left immediate            inslwi ra,rs,n,b (n > 0)           rlwimi ra,rs,32-b,b,(b+n)-1
Insert from right immediate           insrwi ra,rs,n,b (n > 0)           rlwimi ra,rs,32-(b+n),b,(b+n)-1
Rotate left immediate                 rotlwi ra,rs,n                     rlwinm ra,rs,n,0,31
Rotate right immediate                rotrwi ra,rs,n                     rlwinm ra,rs,32-n,0,31
Rotate left                           rotlw ra,rs,rb                     rlwnm ra,rs,rb,0,31
Shift left immediate                  slwi ra,rs,n (n < 32)              rlwinm ra,rs,n,0,31-n
Shift right immediate                 srwi ra,rs,n (n < 32)              rlwinm ra,rs,32-n,n,31
Clear left immediate                  clrlwi ra,rs,n (n < 32)            rlwinm ra,rs,0,n,31
Clear right immediate                 clrrwi ra,rs,n (n < 32)            rlwinm ra,rs,0,0,31-n
Clear left and shift left immediate   clrlslwi ra,rs,b,n (n ≤ b < 32)    rlwinm ra,rs,n,b-n,31-n

Table 9. Word rotate and shift mnemonics



3. Shift the contents of register Rx left 8 bits, clearing the high-order 32
bits.

slwi Rx,Rx,8 (equivalent to: rlwinm Rx,Rx,8,0,23)

4. Clear the high-order 16 bits of the low-order 32 bits of register Ry and 
place the result into register Rx, clearing the high-order 32 bits of reg- 
ister Rx. 

clrlwi Rx,Ry,16 (equivalent to: rlwinm Rx,Ry,0,16,31)
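The word forms can be modeled the same way. This Python sketch is an illustration under simplifying assumptions: registers are treated as 32-bit values (the high-order 32 bits of a 64-bit register are ignored), bits are numbered 0 through 31 with bit 0 most significant (so the text's "bit 32" of a 64-bit register is bit 0 here), and `mask32` is a helper invented for the sketch.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the low-order 32 bits of x left by n bits."""
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask32(mb, me):
    """Ones in bits mb..me of a word (bit 0 is the MSB); wraps around if mb > me."""
    hi = MASK32 >> mb
    lo = (MASK32 << (31 - me)) & MASK32
    return (hi & lo) if mb <= me else (hi | lo)

def rlwinm(rs, sh, mb, me):
    """Rotate left word immediate, then AND with mask."""
    return rotl32(rs, sh) & mask32(mb, me)

def rlwimi(ra, rs, sh, mb, me):
    """Rotate left word immediate, then insert under mask into ra."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & MASK32 & ~m)

# Extended mnemonics, written as in the "Equivalent to" column of Table 9.
def extrwi(rs, n, b):
    return rlwinm(rs, b + n, 32 - n, 31)

def insrwi(ra, rs, n, b):
    return rlwimi(ra, rs, 32 - (b + n), b, (b + n) - 1)

def slwi(rs, n):
    return rlwinm(rs, n, 0, 31 - n)

def srwi(rs, n):
    return rlwinm(rs, 32 - n, n, 31)

def clrlwi(rs, n):
    return rlwinm(rs, 0, n, 31)
```

For instance, `slwi(w, 8)` expands to `rlwinm(w, 8, 0, 23)`, the equivalence given in example 3.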

C.8 Move To/From Special Purpose 
Register Mnemonics

The mtspr and mfspr instructions specify a Special Purpose Register
(SPR) as a numeric operand. Extended mnemonics are provided that
represent the SPR in the mnemonic rather than requiring it to be coded as
an operand.

                                        Move To SPR                  Move From SPR
Special Purpose Register                Extended    Equivalent to    Extended    Equivalent to
Fixed-Point Exception Register (XER)    mtxer Rx    mtspr 1,Rx       mfxer Rx    mfspr Rx,1
Link Register (LR)                      mtlr Rx     mtspr 8,Rx       mflr Rx     mfspr Rx,8
Count Register (CTR)                    mtctr Rx    mtspr 9,Rx       mfctr Rx    mfspr Rx,9

Table 10. Extended mnemonics for moving to/from an SPR

Examples 

1. Copy the contents of the low-order 32 bits of register Rx to the XER. 



mtxer Rx (equivalent to: mtspr 1,Rx)

2. Copy the contents of the LR to register Rx. 

mflr Rx (equivalent to: mfspr Rx,8) 

3. Copy the contents of register Rx to the CTR. 

mtctr Rx (equivalent to: mtspr 9,Rx) 

C.9 Miscellaneous Mnemonics

No-op 

Many PowerPC instructions can be coded in a way such that, effectively, 
no operation is performed. An extended mnemonic is provided for the 
"preferred" form of no-op. If an implementation performs any type of 
runtime optimization related to no-ops, the preferred form is the no-op 
that will trigger this. 

nop (equivalent to: ori 0,0,0) 

Load immediate 

The addi and addis instructions can be used to load an immediate value 
into a register. Extended mnemonics are provided to convey the idea that 
no addition is being performed but merely data movement (from the 
immediate field of the instruction to a register).

Load a 16-bit signed immediate value into register Rx:

li Rx, value (equivalent to: addi Rx,0,value) 

Load a 16-bit signed immediate value, shifted left by 16 bits, into reg- 
ister Rx: 

lis Rx, value (equivalent to: addis Rx,0,value) 
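The values produced by these two mnemonics can be illustrated with a short Python sketch. It assumes only what the text states (a 16-bit signed immediate, optionally shifted left 16 bits) plus the fact that a 0 in the RA position of addi and addis denotes the value zero rather than register 0. The function names are chosen for illustration, and results are returned as signed Python integers.

```python
def sign_extend16(value):
    """Interpret the low-order 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def li(value):
    """Value loaded by li Rx,value (addi Rx,0,value)."""
    return sign_extend16(value)

def lis(value):
    """Value loaded by lis Rx,value (addis Rx,0,value)."""
    return sign_extend16(value) << 16
```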

Load Address 

This mnemonic permits computing the value of a base-displacement 
operand, using the addi instruction which normally requires separate reg- 
ister and immediate operands. 

la Rx,D(Ry) (equivalent to: addi Rx,Ry,D) 

The la mnemonic is useful for obtaining the address of a variable spec- 
ified by name, allowing the assembler to supply the base register number 
and compute the displacement. If the variable v is located at offset Dv 
bytes from the address in register Rv, and the assembler has been told to 
use register Rv as a base for references to the data structure containing v, 
then the following line causes the address of v to be loaded into register 
Rx. 

la Rx,v (equivalent to: addi Rx,Rv,Dv) 

Move Register 

Several PowerPC instructions can be coded in a way such that they sim- 
ply copy the contents of one register to another. An extended mnemonic 
is provided to convey the idea that no computation is being performed 
but merely data movement (from one register to another). 

The following instruction copies the contents of register Ry into
register Rx. This mnemonic can be coded with a final "." to cause the Rc
bit to be set in the underlying instruction.

mr Rx,Ry (equivalent to: or Rx,Ry,Ry) 

Complement Register 

Several PowerPC instructions can be coded in a way such that they com- 
plement the contents of one register and place the result into another reg- 
ister. An extended mnemonic is provided that allows this operation to be 
coded easily. 

The following instruction complements the contents of register Ry and 
places the result into register Rx. This mnemonic can be coded with a 
final "." to cause the Rc bit to be set in the underlying instruction. 

not Rx,Ry (equivalent to: nor Rx,Ry,Ry)

It is computed that eleven Thousand Persons have, at several Times,
suffered Death, rather than submit to break their Eggs at the smaller End.
Many hundred large Volumes have been published upon this
Controversy ....



If scalars (individual data items and instructions) were indivisible, then 
there would be no such concept as "byte ordering". It is meaningless to 
talk of the "order" of bits or groups of bits within the smallest address- 
able unit of storage, because nothing can be observed about such order. 
Only when scalars, which the programmer and processor regard as indi- 
visible quantities, can be made up of more than one addressable unit of 
storage does the question of "order" arise. 

For a machine in which the smallest addressable unit of storage is the 
64-bit doubleword, there is no question of the ordering of "bytes" within 
doublewords. All transfers of individual scalars to and from storage 
(e.g., between registers and storage) are of doublewords, and the address 
of the "byte" containing the high-order 8 bits of a scalar is no different 
from the address of a "byte" containing any other part of the scalar. 

For PowerPC, as for most computers currently available, the smallest 
addressable unit of storage is the 8-bit byte. Many scalars are halfwords, 
words, or doublewords, which consist of groups of bytes. When a
word-length scalar is moved from a register to storage, the scalar
occupies four consecutive byte addresses. It thus becomes meaningful to
discuss the order of the byte addresses with respect to the value of the
scalar: which byte contains the highest-order 8 bits of the scalar, which
byte contains the next-highest-order 8 bits, and so on.

Given a scalar that spans multiple bytes, the choice of byte ordering is 
essentially arbitrary. There are 4! = 24 ways to specify the ordering of 
four bytes within a word, but only two of these orderings are sensible: 

■ The ordering that assigns the lowest address to the highest-order 
("leftmost") 8 bits of the scalar, the next sequential address to the 
next-highest-order 8 bits, and so on. This is called Big-Endian because 
the "big end" of the scalar, considered as a binary number, comes first 
in storage. IBM RISC System/6000, IBM System/370, and Motorola 
680x0 are examples of computers using this byte ordering. 

■ The ordering that assigns the lowest address to the lowest-order 
("rightmost") 8 bits of the scalar, the next sequential address to the 
next-lowest-order 8 bits, and so on. This is called Little-Endian 
because the "little end" of the scalar, considered as a binary number, 
comes first in storage. DEC VAX and Intel x86 are examples of com- 
puters using this byte ordering. 
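The two orderings can be made concrete with a short Python sketch that lists the bytes of a scalar in ascending address order. The function names are illustrative; the word value 0x1112_1314 is borrowed from the structure example later in this appendix.

```python
def to_bytes_big(value, length):
    """Bytes in ascending address order, highest-order byte first."""
    return [(value >> (8 * (length - 1 - i))) & 0xFF for i in range(length)]

def to_bytes_little(value, length):
    """Bytes in ascending address order, lowest-order byte first."""
    return [(value >> (8 * i)) & 0xFF for i in range(length)]

word = 0x11121314
big = to_bytes_big(word, 4)        # lowest address holds 0x11
little = to_bytes_little(word, 4)  # lowest address holds 0x14
```

The same results can be obtained from Python's built-in `int.to_bytes` with the byte order given as "big" or "little".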

D.2 Structure Mapping Examples 

Figure 35 on page 235 shows an example of a C language structure s 
containing an assortment of scalars and one character string. The value 
assumed to be in each structure element is shown in hex in the C com- 
ments; these values are used below to show how the bytes making up 
each structure element are mapped into storage. 

C structure mapping rules permit the use of padding (skipped bytes) in 
order to align the scalars on desirable boundaries. Figures 36 and 37 
show each scalar aligned at its natural boundary. This alignment intro- 
duces padding of four bytes between a and b, one byte between d and e, 
and two bytes between e and f. The same amount of padding is present 
for both Big-Endian and Little-Endian mappings.

    struct {
        int     a;      /* 0x1112_1314                    word           */
        double  b;      /* 0x2122_2324_2526_2728          doubleword     */
        char *  c;      /* 0x3132_3334                    word           */
        char    d[7];   /* 'A','B','C','D','E','F','G'    array of bytes */
        short   e;      /* 0x5152                         halfword       */
        int     f;      /* 0x6162_6364                    word           */
    } s;



Figure 35. C structure 's', showing values of elements 
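The padding described above follows mechanically from aligning each element at its natural boundary. The Python sketch below recomputes the offsets; the sizes are taken from the comments in Figure 35 (in particular, the pointer c is assumed to be a 4-byte word), and the names `FIELDS` and `layout` are invented for this illustration.

```python
# (name, size in bytes, natural alignment in bytes)
FIELDS = [
    ("a", 4, 4),   # int, word
    ("b", 8, 8),   # double, doubleword
    ("c", 4, 4),   # char *, assumed to be a word as in Figure 35
    ("d", 7, 1),   # char[7], byte aligned
    ("e", 2, 2),   # short, halfword
    ("f", 4, 4),   # int, word
]

def layout(fields):
    """Return {name: offset} with each field rounded up to its alignment."""
    offsets = {}
    offset = 0
    for name, size, align in fields:
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
    return offsets

offsets = layout(FIELDS)
```

The resulting offsets (in hex: a at 00, b at 08, c at 10, d at 14, e at 1C, f at 20) are the same for both byte orderings; only the order of bytes within each scalar differs.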

D.2.1 Big-Endian Mapping 

The Big-Endian mapping of structure s is shown in Figure 36. Addresses 
are shown in hex at the left of each doubleword, and below each byte. 
The content of each byte, as indicated in the C example in Figure 35, is 
shown in hex (as characters for the elements of the string).

 00   11   12   13   14
      00   01   02   03   04   05   06   07
 08   21   22   23   24   25   26   27   28
      08   09   0A   0B   0C   0D   0E   0F
 10   31   32   33   34   'A'  'B'  'C'  'D'
      10   11   12   13   14   15   16   17
 18   'E'  'F'  'G'       51   52
      18   19   1A   1B   1C   1D   1E   1F
 20   61   62   63   64
      20   21   22   23

Figure 36. Big-Endian mapping of structure 's' 



D.2.2 Little-Endian Mapping 

The same structure s is shown mapped Little-Endian in Figure 37. Dou- 
blewords are shown laid out from right to left, which is the common way 
of showing storage maps for Little-Endian machines.

                     11   12   13   14    00
 07   06   05   04   03   02   01   00
 21   22   23   24   25   26   27   28    08
 0F   0E   0D   0C   0B   0A   09   08
 'D'  'C'  'B'  'A'  31   32   33   34    10
 17   16   15   14   13   12   11   10
           51   52        'G'  'F'  'E'   18
 1F   1E   1D   1C   1B   1A   19   18
                     61   62   63   64    20
                     23   22   21   20

Figure 37. Little-Endian mapping of structure 's'

D.3 PowerPC Byte Ordering 

The body of each of the three PowerPC Architecture books: Book I,
PowerPC User Instruction Set Architecture, Book II, PowerPC Virtual
Environment Architecture, and Book III, PowerPC Operating Environment
Architecture, is written as if a PowerPC system runs only in Big-Endian
mode. In fact, a PowerPC system can instead run in Little-Endian mode, in
which the instruction set behaves as if the byte ordering were
Little-Endian, and can change Endian mode dynamically. The remainder of
this appendix describes how the mode is controlled, and how running in
Little-Endian mode differs from running in Big-Endian mode.

D.3.1 Controlling PowerPC Byte Ordering 

The Endian mode of a PowerPC processor is controlled by two bits: the
LE (Little-Endian Mode) bit specifies the current mode of the processor,
and the ILE (Interrupt Little-Endian Mode) bit specifies the mode that the
processor enters when the system error handler is invoked. For both bits,
a value of 0 specifies Big-Endian mode and a value of 1 specifies
Little-Endian mode. The location of these bits and the requirements for
altering them are described in Book III, Section 2.2.3, "Machine State
Register," on page 374 and Chapter 7, "Synchronization Requirements for
Special Registers and for Lookaside Buffers," on page 483.

When a PowerPC system comes up after power-on-reset, Big-Endian 
mode is in effect. Thereafter, methods described in Book III can be used 
to change the mode, as can both invoking the system error handler and 
returning from the system error handler.

D.3.2 PowerPC Little-Endian Byte Ordering

One might expect that a PowerPC system running in Little-Endian mode 
would have to perform a 2-way, 4-way, or 8-way byte swap when trans- 
ferring a halfword, word, or doubleword to or from storage, e.g., when 
transferring data between storage and a general purpose register or float- 
ing-point register, when fetching instructions, and when transferring data 
between storage and an Input/Output (I/O) device. PowerPC systems do 
not do such swapping, but instead achieve the effect of Little-Endian byte 
ordering by modifying the low-order three bits of the effective address 
(EA) as described below. Individual scalars actually appear in storage in 
Big-Endian byte order. 

The modification affects only the addresses presented to the storage 
subsystem (see Book III, PowerPC Operating Environment Architecture). 
All effective addresses in architecturally defined registers, as well as the 
Current Instruction Address (CIA) and Next Instruction Address (NIA), 
are independent of Endian mode. For example: 

■ The effective address placed into the Link Register by a Branch instruc- 
tion with LK=1 is equal to the CIA of the Branch instruction + 4; 

■ The effective address placed into RA by a Load/Store with Update 
instruction is the value computed as described in the instruction 
description; and 

■ The effective addresses placed into System Registers when the system
error handler is invoked (e.g., SRR0, DAR: see Book III, PowerPC
Operating Environment Architecture) are those that were computed
or would have been computed by the interrupted program.

The modification is independent of the address translation mechanism, 
and is performed regardless of whether translation is enabled or disabled, 
whether the accessed storage is in an ordinary segment, a direct-store seg- 
ment, or a BAT area, etc. (see Book III, PowerPC Operating Environment 
Architecture). The actual transfer of data and instructions to and from 
storage is unaffected (and thus unencumbered by multiplexors for byte 
swapping). 

The modification of the low-order three bits of the effective address in 
Little-Endian mode is done as follows, for access to an individual aligned 
scalar. (Alignment is as determined before this modification.) Access to 
an individual unaligned scalar or to multiple scalars is described in
subsequent sections, as is access to certain architecturally defined data
in storage, data in caches (see Book II, PowerPC Virtual Environment
Architecture, and Book III, PowerPC Operating Environment Architecture),
etc.

In Little-Endian mode, the effective address is computed in the same
way as in Big-Endian mode. Then, in Little-Endian mode only, the
low-order three bits of the effective address are Exclusive ORed with a
three-bit value that depends on the length of the operand (1, 2, 4, or 8
bytes), as shown in Table 11. This modified effective address is then
passed to the storage subsystem, and data of the specified length are
transferred to or from the addressed (as modified) storage location(s).



Data length (bytes)    EA modification
1                      XOR with 0b111
2                      XOR with 0b110
4                      XOR with 0b100
8                      (no change)

Table 11. PowerPC Little-Endian, effective address modification for
individual aligned scalars



The effective address modification makes it appear to the processor 
that individual aligned scalars are stored Little-Endian, while in fact they 
are stored Big-Endian but in different bytes within doublewords from the 
order in which they are stored in Big-Endian mode. 
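The modification in Table 11 amounts to a single Exclusive OR with the value 8 minus the operand length. The following Python sketch states that rule for individual aligned scalars; the function name is illustrative, and the sketch rejects unaligned operands rather than modeling them.

```python
def le_storage_address(ea, length):
    """Address presented to the storage subsystem in Little-Endian mode.

    ea is the effective address computed by the instruction; length is the
    operand length in bytes. Only individual aligned scalars are modeled.
    """
    if length not in (1, 2, 4, 8):
        raise ValueError("operand length must be 1, 2, 4, or 8 bytes")
    if ea % length != 0:
        raise ValueError("this sketch covers aligned scalars only")
    # 8 - length gives 0b111, 0b110, 0b100, 0b000 for lengths 1, 2, 4, 8.
    return ea ^ (8 - length)
```

For example, a word at effective address 0 is accessed at storage address 4, and the halfword e of structure s at effective address 0x1C is accessed at storage address 0x1A.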

For example, in Little-Endian mode structure s would be placed in 
storage as follows, from the point of view of the storage subsystem (i.e., 
after the effective address modification described above).

Figure 38. PowerPC Little-Endian, structure 's' in storage subsystem

Figure 38 is identical to Figure 37 except that the byte numbers within
each doubleword are reversed. (This identity is in some sense an artifact
of depicting storage as a sequence of doublewords. If storage is instead
depicted as a sequence of words, a single byte stream, etc., then no such
identity appears. However, regardless of the unit in which storage is
depicted or accessed, the address of a given byte in Figure 38 differs from
the address of the same byte in Figure 37 only in the low-order three bits,
and the sum of the two 3-bit values that comprise the low-order three bits
of the two addresses is equal to 7. Depicting storage as a sequence of
doublewords makes this relationship easy to see.)

Because of the modification performed on effective addresses, struc- 
ture s appears to the processor to be mapped into storage as follows 
when the processor is in Little-Endian mode.

Figure 39. PowerPC Little-Endian, structure 's' as seen by processor

Notice that, as seen by the program executing in the processor, the 
mapping for structure s is identical to the Little-Endian mapping shown 
in Figure 37. From a point of view outside the processor, however, the 
addresses of the bytes making up structure s are as shown in Figure 38. 
These addresses match neither the Big-Endian mapping of Figure 36 nor 
the Little-Endian mapping of Figure 37; allowance must be made for this 
in certain circumstances (e.g., when performing I/O: see Section D.7). 
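The relationship between the processor's view and the storage subsystem's view can be reproduced numerically. The Python sketch below builds the Little-Endian image of structure s (offsets and values as in Figure 35, with natural alignment) and then derives the storage subsystem's image by Exclusive ORing each byte address with 7, which is the per-byte consequence of the rule in Table 11. The function names and the dictionary representation are assumptions of the sketch.

```python
def little_endian_image():
    """Bytes of structure s, keyed by the address the program sees."""
    image = {}

    def put(addr, value, length):
        # Lowest-order byte at the lowest address.
        for i in range(length):
            image[addr + i] = (value >> (8 * i)) & 0xFF

    put(0x00, 0x11121314, 4)            # a
    put(0x08, 0x2122232425262728, 8)    # b
    put(0x10, 0x31323334, 4)            # c
    for i, ch in enumerate("ABCDEFG"):  # d
        image[0x14 + i] = ord(ch)
    put(0x1C, 0x5152, 2)                # e
    put(0x20, 0x61626364, 4)            # f
    return image

def storage_image(le_image):
    """Same bytes, keyed by the address the storage subsystem sees."""
    return {addr ^ 7: byte for addr, byte in le_image.items()}

seen_by_program = little_endian_image()
in_storage = storage_image(seen_by_program)
```

In `in_storage`, the word a occupies addresses 4 through 7 in Big-Endian byte order (11 12 13 14), which illustrates the statement that individual scalars actually appear in storage in Big-Endian order but at different addresses than in Big-Endian mode.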

The following four sections describe in greater detail the effects of run- 
ning in Little-Endian mode on accessing data storage, on fetching instruc- 
tions, on explicitly accessing the caches, the Segment Lookaside Buffer, 
and the Translation Lookaside Buffer (see Book II, PowerPC Virtual 
Environment Architecture, and Book III, PowerPC Operating Environ- 
ment Architecture), and on doing I/O. 






Appendix D Little-Endian Byte Ordering 



D.4 PowerPC Data Storage Addressing 
in Little-Endian Mode

D.4.1 Individual Aligned Scalars 

When the storage operand is aligned for any instruction in the following
classes, the effective address presented to the storage subsystem is com- 
puted as described in Section D.3.2: Fixed-Point Load, Fixed-Point Store, 
Load and Store with Byte Reversal, Storage Synchronization (excluding 
sync), Floating-Point Load, and Floating-Point Store (including stfiwx). 

The Load and Store with Byte Reversal instructions have the effect of 
loading or storing data in the opposite Endian mode from that in which 
the processor is running. That is, data are loaded or stored in Little- 
Endian order if the processor is running in Big-Endian mode, and in Big- 
Endian order if the processor is running in Little-Endian mode. 
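The effect of the byte-reversal forms can be sketched as follows. This Python model works entirely from the program's point of view: `view` is storage as the running program sees it, `mode` is the Endian mode the processor is running in, and `byte_reversed` selects a Load with Byte Reversal instead of an ordinary Load. All three names are assumptions of the sketch.

```python
def load_word(view, ea, mode, byte_reversed=False):
    """Value loaded from the four bytes at ea, as seen by the program.

    mode is "big" or "little". A byte-reversed load interprets the bytes
    in the opposite order from the current mode.
    """
    order = mode
    if byte_reversed:
        order = "little" if mode == "big" else "big"
    return int.from_bytes(view[ea:ea + 4], order)

view = bytes([0x11, 0x12, 0x13, 0x14])
```

In Big-Endian mode an ordinary load of these bytes yields 0x11121314 and a byte-reversed load yields 0x14131211; in Little-Endian mode the two results are exchanged.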

D.4.2 Other Scalars

As described below, the system alignment error handler may be (see
"Individual Unaligned Scalars") or is (see "Multiple Scalars," on
page 241) invoked if an attempt is made in Little-Endian mode to execute
any of the instructions described in the following two subsections.

Individual Unaligned Scalars 

The "trick" of Exclusive ORing the low-order three bits of the effective 
address of an individual scalar does not work unless the scalar is aligned. 
In Little-Endian mode, PowerPC processors may cause the system align- 
ment error handler to be invoked whenever any of the Load or Store 
instructions listed in Section D.4.1 is issued with an unaligned effective 
address, regardless of whether such an access could be handled without 
invoking the system alignment error handler in Big-Endian mode. 

PowerPC processors are not required to invoke the system alignment 
error handler when an unaligned access is attempted in Little-Endian 
mode. The implementation may handle some or all such accesses without 
invoking the system alignment error handler, just as in Big-Endian mode. 
The architectural requirement is that halfwords, words, and doublewords
be placed in storage such that the Little-Endian effective address of the
lowest-order byte is the effective address computed by the Load or Store
instruction, the Little-Endian address of the next-lowest-order byte is one
greater, and so on. (lwarx, ldarx, stwcx., and stdcx. differ somewhat
from the rest of the instructions listed in Section D.4.1, in that neither the
implementation nor the system alignment error handler is expected to
handle these four instructions "correctly" if their operands are not
aligned.)

Figure 40 shows an example of a word w stored at Little-Endian
address 5. The word is assumed to contain the binary value
0x1112_1314.

Figure 40. Little-Endian mapping of word 'w' stored at address 5

In Little-Endian mode word w would be placed in storage as follows,
from the point of view of the storage subsystem (i.e., after the effective
address modification described in Section D.3.2).

Figure 41. PowerPC Little-Endian, word 'w' stored at address 5 in storage
subsystem

Notice that the unaligned word w in Figure 41 spans two double- 
words. The two parts of the unaligned word are not contiguous as seen 
by the storage subsystem. 
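The non-contiguity can be seen by applying the architectural requirement byte by byte: each byte's Little-Endian address is converted to a storage-subsystem address by Exclusive ORing its low-order three bits with 0b111. The Python sketch below does this for word w; the function name is illustrative.

```python
def unaligned_storage_addresses(le_ea, length):
    """Storage-subsystem address of each byte, lowest-order byte first."""
    return [(le_ea + i) ^ 7 for i in range(length)]

# Word w = 0x1112_1314 at Little-Endian address 5:
# bytes 0x14, 0x13, 0x12, 0x11 occupy Little-Endian addresses 5, 6, 7, 8.
addresses = unaligned_storage_addresses(5, 4)
```

Here `addresses` is [2, 1, 0, 15]: three bytes of w fall in the doubleword at address 0 and the highest-order byte falls at address 0x0F in the next doubleword, so the two parts are separated in storage.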

An implementation may choose to support some but not all unaligned
Little-Endian accesses. For example, an unaligned Little-Endian access
that is contained within a single doubleword may be supported, while
one that spans doublewords may cause the system alignment error
handler to be invoked.



Multiple Scalars 

PowerPC has two classes of instructions that handle multiple scalars,
namely the Load and Store Multiple instructions and the Move Assist
instructions. Because both classes of instructions potentially deal with
more than one word-length scalar, neither class is amenable to the
effective address modification described in Section D.3.2 (e.g., pairs of
aligned words would be accessed in reverse order from what the program
would expect). Attempting to execute any of these instructions in
Little-Endian mode causes the system alignment error handler to be
invoked.

Programming Note

If the system alignment error handler is invoked because one of the
instructions described in Section D.4.2 is executed when the processor is
in Little-Endian mode, system software must decide whether to emulate
the instruction and then resume the program, or treat the instruction as
an illegal instruction and terminate the program.

Little-Endian mode programs on PowerPC are of necessity new (not old
POWER binaries). It is probably best for the compiler not to generate
these instructions in Little-Endian mode, since emulation would be slower
than using a series of aligned Load or Store instructions, either in-line
or in a subroutine. An exception is the case of accessing an individual
scalar (see "Individual Unaligned Scalars," on page 240) when the
alignment is not known by the compiler but the operand is expected
usually to be aligned: in this case it may be better for the compiler to
generate the individual Load or Store instruction, and let the system
alignment error handler, if invoked, emulate the instruction if the
operand is in fact unaligned.



D.4.3 Segment Tables and Page Tables 

The layout of Segment Tables and Page Tables in storage (see Book III, 
Chapter 4, "Storage Control," on page 391) is independent of Endian 
mode. A given byte in one of these tables must be accessed using an 
effective address appropriate to the mode of the executing program (e.g., 
the high-order byte of a Page Table Entry must be accessed with an
effective address ending with 0b000 in Big-Endian mode, and with an
effective address ending with 0b111 in Little-Endian mode).



D.5 PowerPC Instruction Storage
Addressing in Little-Endian Mode 

Each PowerPC instruction occupies an aligned word in storage. The pro- 
cessor fetches and executes instructions as if the CIA were advanced by 
four for each sequentially fetched instruction. When the processor is in 
Little-Endian mode, the effective address presented to the storage sub- 
system in order to fetch an instruction is the value from the CIA, modi- 
fied as described in Section D.3.2 for aligned word-length scalars. A 
Little-Endian program is thus an array of aligned Little-Endian words, 
with each word fetched and executed in order (discounting branches and 
invocations of the system error handler). 

Figure 42 shows an example of a small assembly language program p. 



    loop:
            cmplwi  r5,0
            beq     done
            lwzux   r4,r5,r6
            add     r7,r7,r4
            subi    r5,r5,4
            b       loop
    done:
            stw     r7,total



Figure 42. Assembly language program 'p'

The Big-Endian mapping for program p is shown in Figure 43 (assuming
the program starts at address 0).

Figure 43. Big-Endian mapping of program 'p'

The same program p is shown mapped Little-Endian in Figure 44.

Figure 44. Little-Endian mapping of program 'p'

In Little-Endian mode program p would be placed in storage as follows,
from the point of view of the storage subsystem (i.e., after the effective
address modification described in Section D.3.2).

Figure 45. PowerPC Little-Endian, program 'p' in storage subsystem



Figure 45 is identical to Figure 44 except that the byte numbers within
each doubleword are reversed. (This identity is in some sense an artifact
of depicting storage as a sequence of doublewords. If storage is instead
depicted as a sequence of words, a single byte stream, etc., then no such
identity appears. However, regardless of the unit in which storage is
depicted or accessed, the address of a given byte in Figure 45 differs from
the address of the same byte in Figure 44 only in the low-order three bits,
and the sum of the two 3-bit values that comprise the low-order three bits
of the two addresses is equal to 7. Depicting storage as a sequence of
doublewords makes this relationship easy to see.)

Programming Note

In general, a given subroutine in storage cannot be shared between
programs running in different Endian modes. This affects the sharing of
subroutine libraries.

Each individual machine instruction appears in storage as a 32-bit 
integer containing the value described in the instruction description, 
regardless of the Endian mode. This is a consequence of the fact that indi- 
vidual aligned scalars are mapped in storage in Big-Endian byte order. 

Notice that, as seen by the processor when executing program p, the 
mapping for program p is identical to the Little-Endian mapping shown 
in Figure 44. From a point of view outside the processor, however, the 
addresses of the bytes making up program p are as shown in Figure 45. 
These addresses match neither the Big-Endian mapping of Figure 43 nor 
the Little-Endian mapping of Figure 44. 

All instruction effective addresses visible to an executing program are 
the effective addresses that are computed by that program or, in the case 
of the system error handler, effective addresses that were or could have 
been computed by the interrupted program. These effective addresses are 
independent of Endian mode. Examples for Little-Endian mode include 
the following: 

■ An instruction address placed in the Link Register by a Branch instruc- 
tion with LK=1, or an instruction address saved in a System Register 
when the system error handler is invoked, is the effective address that 
a program executing in Little-Endian mode would use to access the 
instruction as a data word using a Load instruction. 

■ An offset in a relative Branch instruction (Branch or Branch Condi- 
tional with AA=0) reflects the difference between the addresses of the 
branch and target instructions, using the addresses that a program 
executing in Little-Endian mode would use to access the instructions 
as data words using Load instructions. 

■ A target address in an absolute Branch instruction (Branch or Branch 
Conditional with AA=1) is the address that a program executing in 
Little-Endian mode would use to access the target instruction as a data 
word using a Load instruction. 

■ The storage locations that contain the first set of instructions executed
by each kind of system error handler must be set in a manner consistent
with the Endian mode in which the system error handler will be
invoked. (These sets of instructions occupy architecturally defined
locations: see Book III, Chapter 5, "Interrupts," on page 453.) Thus if 
the system error handler is to be invoked in Little-Endian mode, the 
first set of instructions for each kind of system error handler must 
appear in storage, from the point of view of the storage subsystem 
(i.e., after the effective address modification described in Section 
D.3.2), with the pairs of instructions within each doubleword reversed 
from the order in which they are to be executed. (If the instructions are 
placed into storage by a program running in the same Endian mode as 
that in which the system error handler will be invoked, the appropriate 
order will be achieved naturally.) 
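The reversal of instruction pairs within each doubleword follows directly from applying the word-length modification of Table 11 to the CIA. The Python sketch below lists, for a run of sequentially executed instructions, the address presented to the storage subsystem for each fetch; the function name is illustrative, and the seven-instruction run corresponds to program p of Figure 42 assumed to start at address 0.

```python
def fetch_addresses(start, count):
    """Storage-subsystem address of each instruction fetched in
    Little-Endian mode, for count sequential instructions from CIA = start."""
    if start % 4 != 0:
        raise ValueError("instructions occupy aligned words")
    return [(start + 4 * i) ^ 0b100 for i in range(count)]

# Program p: seven instructions with CIA = 0x00, 0x04, ..., 0x18.
addresses = fetch_addresses(0, 7)
```

The result is 0x04, 0x00, 0x0C, 0x08, 0x14, 0x10, 0x1C: within each doubleword the instruction executed first resides in the second word as seen by the storage subsystem.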

D.6 PowerPC Cache Management and 
Lookaside Buffer Management 
Instructions in Little-Endian Mode 

The instructions for explicitly accessing the caches, Segment Lookaside
Buffer, and Translation Lookaside Buffer (see Book II, PowerPC Virtual
Environment Architecture, and Book III, PowerPC Operating Environment
Architecture) are unaffected by Endian mode. (Identification of the
block, Segment Table Entry, or Page Table Entry to be accessed is not
affected by the low-order three bits of the effective address.)

D.7 PowerPC I/O in Little-Endian Mode 

Input/output (I/O), such as writing the contents of a large area of storage 
to disk, transfers a byte stream on both Big-Endian and Little-Endian sys- 
tems. For the disk transfer, the first byte of the area is written to the first 
byte of the disk record and so on. 

For a PowerPC system running in Big-Endian mode, I/O transfers hap- 
pen "naturally" because the byte that the processor sees as byte 0 is the 
same one that the storage subsystem sees as byte 0. 

For a PowerPC system running in Little-Endian mode this is not the 
case, because of the modification of the low-order three bits of the effec- 
tive address when the processor accesses storage. In order for I/O trans- 
fers to transfer byte streams properly, in Little-Endian mode I/O transfers 
must be performed as if the bytes transferred were accessed one byte at a 
time, using the address modification described in Section D.3.2 for
single-byte scalars. This does not mean that I/O on Little-Endian PowerPC
systems must use only 1-byte-wide transfers; data transfers can be as
wide as desired, but the order of the bytes transferred within double- 
words must appear as if the bytes were fetched or stored one byte at a 
time. See the System Architecture documentation for a given PowerPC 
system for details on the transfer width and byte ordering on that system. 
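
The byte-ordering requirement can be illustrated with a small model. The following Python sketch is not part of the architecture: it assumes that the single-byte modification of Section D.3.2 exclusive-ORs the low-order three bits of the effective address with 0b111, and the function names (munge, storage_image, processor_view) are hypothetical.

```python
def munge(ea: int) -> int:
    """Single-byte address modification (assumed): XOR the low three bits with 0b111."""
    return ea ^ 0b111


def storage_image(stream: bytes) -> bytes:
    """Lay out a byte stream in storage so the processor reads byte i at address i.

    The stream length is assumed to be a multiple of 8 (whole doublewords).
    """
    image = bytearray(len(stream))
    for ea, value in enumerate(stream):
        image[munge(ea)] = value
    return bytes(image)


def processor_view(image: bytes) -> bytes:
    """Read the storage image back one byte at a time, as the processor would."""
    return bytes(image[munge(ea)] for ea in range(len(image)))


stream = bytes(range(16))
image = storage_image(stream)
```

In this model the storage image holds the bytes of each doubleword in the reverse of stream order, which is why a wide transfer must order the bytes within doublewords as if they had been moved one at a time.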

However, not all I/O done on PowerPC systems is for large areas of 
storage as described above. I/O can be performed with certain devices 
merely by storing to or loading from addresses that are associated with 
the devices (the terms "memory-mapped I/O" and "programmed I/O" or 
"PIO" are used for this). For such PIO transfers, care must be taken 
when defining the addresses to be used, for these addresses are subject to 
the effective address modification shown in Table 11 on page 238. A 
Load or Store instruction that maps to a control register on a device may 
require that the value loaded or stored have its bytes reversed; if this is 
required, the Load and Store with Byte Reversal instructions can be used. 
Any requirement for such byte reversal for a particular I/O device register 
is independent of whether the PowerPC system is running in Big-Endian 
or Little-Endian mode. 
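
As a rough illustration of what byte reversal does to a register value, the Python sketch below reverses the four bytes of a word. It models only the data transformation of a word-sized Load or Store with Byte Reversal, not the addressing, and the function name is hypothetical.

```python
def byte_reverse_word(value: int) -> int:
    """Return the 32-bit value with its four bytes in the opposite order."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "big"), "little")
```

Applying the function twice returns the original value, so the same operation serves for both the load and the store direction.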

Similarly, the address sent to an I/O device by an eciwx or ecowx 
instruction (see Book III, Section A.1, "External Control," on page 489)
is subject to the effective address modification shown in Table 11. 

D.8 Origin of Endian 

The terms Big-Endian and Little-Endian come from Part I, Chapter 4, of 
Jonathan Swift's Gulliver's Travels. Here is the complete passage, from
the edition printed in 1734 by George Faulkner in Dublin. 

. . . our Histories of six Thousand Moons make no Mention of any 
other Regions, than the two great Empires of Lilliput and Blefuscu, 
Which two mighty Powers have, as I was going to tell you, been 
engaged in a most obstinate War for six and thirty Moons past. It 
began upon the following Occasion. It is allowed on all Hands, 
that the primitive Way of breaking Eggs before we eat them, was 
upon the larger End: But his present Majesty's Grand-father, while 
he was a Boy, going to eat an Egg, and breaking it according to the 
ancient Practice, happened to cut one of his Fingers. Whereupon 
the Emperor his Father, published an Edict, commanding all his 
Subjects, upon great Penalties, to break the smaller End of their 
Eggs. The People so highly resented this Law, that our Histories tell 
us, there have been six Rebellions raised on that Account; wherein
one Emperor lost his Life, and another his Crown. These civil
Commotions were constantly fomented by the Monarchs of Ble- 
fuscu; and when they were quelled, the Exiles always fled for Ref- 
uge to that Empire. It is computed that eleven Thousand Persons 
have, at several Times, suffered Death, rather than submit to break 
their Eggs at the smaller End. Many hundred large Volumes have 
been published upon this Controversy: But the Books of the Big-Endians
have been long forbidden, and the whole Party rendered
incapable by Law of holding Employments. During the Course of 
these Troubles, the Emperors of Blefuscu did frequently expostulate 
by their Ambassadors, accusing us of making a Schism in Religion, 
by offending against a fundamental Doctrine of our great Prophet 
Lustrog, in the fifty-fourth Chapter of the Brundrecal, (which is 
their Alcoran.) This, however, is thought to be a mere Strain upon 
the text: For the Words are these; That all true Believers shall break 
their Eggs at the convenient End: and which is the convenient End, 
seems, in my humble Opinion, to be left to every Man's Con- 
science, or at least in the Power of the chief Magistrate to deter- 
mine. Now the Big-Endian Exiles have found so much Credit in 
the Emperor of Blefuscu's Court; and so much private Assistance
and Encouragement from their Party here at home, that a bloody 
War has been carried on between the two Empires for six and thirty 
Moons with various Success; during which Time we have lost Forty 
Capital Ships, and a much greater Number of smaller Vessels, 
together with thirty thousand of our best Seamen and Soldiers; and 
the Damage received by the Enemy is reckoned to be somewhat 
greater than ours. However, they have now equipped a numerous 
Fleet, and are just preparing to make a Descent upon us: and his 
Imperial Majesty, placing great Confidence in your Valour and 
Strength, hath commanded me to lay this Account of his Affairs 
before you.

E.1 Synchronization

This section gives examples of how the Storage Synchronization instruc- 
tions can be used to emulate various synchronization primitives and to 
provide more complex forms of synchronization. 

These examples have a common form. After possible initialization, 
there is a "conditional sequence" that begins with a Load And Reserve 
instruction, which may be followed by memory accesses and/or computa- 
tion that include neither a Load And Reserve nor a Store Conditional, 
and ends with a Store Conditional instruction with the same target
address as the initial Load And Reserve. In most of the examples, failure
of the Store Conditional causes a branch back to the Load And Reserve 
for a repeated attempt. On the assumption that contention is low, the 
conditional branch in the examples is optimized for the case in which the 
Store Conditional succeeds, by setting the branch-prediction bit appropri- 
ately. This is done by appending a minus sign to the instruction mne- 
monic, as described in Section C.2.4, "Branch Prediction," on page 220. 
These examples focus on techniques for the correct modification of 
shared storage locations: see Note 4 in Section E.1.4 for a discussion of 
how the retry strategy can affect performance. 

The Load And Reserve and Store Conditional instructions depend on 
the coherence mechanism of the system. Stores to a given location are 
coherent if they are serialized in some order, and no processor is able to 
observe a subset of those stores as occurring in a conflicting order. See 
Book II, Section 1.5, "Memory Coherence," on page 323 for additional 
details. 






Appendix E Programming Examples 



Each load operation, whether ordinary or Load And Reserve, returns 
a value that has a well-defined source. The source can be the Store or 
Store Conditional instruction that wrote the value, an operation by some 
other mechanism that accesses storage (e.g., an I/O device), or the initial 
state of storage. 

The function of an atomic read/modify/write operation is to read a 
location and write its next value, possibly as a function of its current 
value, all as a single atomic operation. We assume that locations accessed 
by read/modify/write operations are accessed coherently, so the concept 
of a value being the next in the sequence of values for a location is well 
defined. The conditional sequence, as defined above, provides the effect 
of an atomic read/modify/write operation, but not with a single atomic 
instruction. Let addr be the location that is the common target of the 
Load And Reserve and Store Conditional instructions. Then the guaran- 
tee the architecture makes for the successful execution of the conditional 
sequence is that no store into addr by another processor or mechanism 
has intervened between the source of the Load And Reserve and the Store
Conditional.

For each of these examples, it is assumed that a similar sequence of 
instructions is used by all processes requiring synchronization on the 
accessed data. 

The examples deal with words: they can be used for doublewords by
changing all lwarx instructions to ldarx, all stwcx. instructions to stdcx.,
all stw instructions to std, and all cmpw[i] extended mnemonics to
cmpd[i].

E.1.1 Synchronization Primitives 

The following examples show how the lwarx and stwcx. instructions can
be used to emulate various synchronization primitives.

The sequences used to emulate the various primitives consist primarily
of a loop using lwarx and stwcx. No additional synchronization is
necessary, because the stwcx. will fail, setting the EQ bit to 0, if the word
loaded by lwarx has changed before the stwcx. is executed: see Book II,
Section 1.8.2, "Atomic Update Primitives," on page 336 for more detail.

Fetch and No-op 

The "Fetch and No-op" primitive atomically loads the current value in a 
word in storage. 

In this example it is assumed that the address of the word to be loaded
is in GPR 3 and the data loaded are returned in GPR 4.

Note: Because the Storage Synchronization instructions have
implementation dependencies (e.g., the granularity at which
reservations are managed), they must be used with care. The
operating system should provide system library programs that use these
instructions to implement the high-level synchronization functions
(Test and Set, Compare and Swap, etc.) needed by application programs.
Application programs should use these library programs, rather than
use the Storage Synchronization instructions directly.

loop:   lwarx   r4,0,r3         #load and reserve
        stwcx.  r4,0,r3         #store old value if still reserved
        bne-    loop            #loop if lost reservation

Note: 

1. The stwcx., if it succeeds, stores to the target location the same value
that was loaded by the preceding lwarx. While the store is redundant
with respect to the value in the location, its success ensures that the
value loaded by the lwarx was the current value, i.e., that the source of
the value loaded by the lwarx was the last store to the location that
preceded the stwcx. in the coherence order for the location.

Fetch and Store 

The "Fetch and Store" primitive atomically loads and replaces a word in 
storage. 

In this example it is assumed that the address of the word to be loaded 
and replaced is in GPR 3, the new value is in GPR 4, and the old value is 
returned in GPR 5. 

loop:   lwarx   r5,0,r3         #load and reserve
        stwcx.  r4,0,r3         #store new value if still reserved
        bne-    loop            #loop if lost reservation

Fetch and Add 

The "Fetch and Add" primitive atomically increments a word in storage. 

In this example it is assumed that the address of the word to be incre- 
mented is in GPR 3, the increment is in GPR 4, and the old value is 
returned in GPR 5. 

loop:   lwarx   r5,0,r3         #load and reserve
        add     r0,r4,r5        #increment word
        stwcx.  r0,0,r3         #store new value if still reserved
        bne-    loop            #loop if lost reservation
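
The retry behavior of this conditional sequence can be sketched with a toy model. The Python below is an illustration only: the Storage class, its single reservation, and the "intruders" hook that lets another processor store between the Load And Reserve and the Store Conditional are all assumptions made for the sketch, not architected behavior.

```python
class Storage:
    """Toy model: word storage plus the single reservation of one processor."""

    def __init__(self):
        self.words = {}
        self.reserved = None          # address reserved by lwarx, if any

    def lwarx(self, addr):
        self.reserved = addr
        return self.words.get(addr, 0)

    def stwcx(self, addr, value):
        """Store only if the reservation survived; report success like CR0[EQ]."""
        ok = self.reserved == addr
        if ok:
            self.words[addr] = value
        self.reserved = None
        return ok

    def other_store(self, addr, value):
        """A store by another processor or mechanism; cancels a matching reservation."""
        self.words[addr] = value
        if self.reserved == addr:
            self.reserved = None


def fetch_and_add(mem, addr, increment, intruders=()):
    """Return (old value, number of attempts); intruders run between lwarx and stwcx."""
    pending = list(intruders)
    attempts = 0
    while True:
        attempts += 1
        old = mem.lwarx(addr)                      # lwarx  r5,0,r3
        if pending:
            pending.pop(0)(mem)                    # interference from elsewhere
        new = (old + increment) & 0xFFFFFFFF       # add    r0,r4,r5
        if mem.stwcx(addr, new):                   # stwcx. r0,0,r3
            return old, attempts                   # otherwise bne- loop
```

When an intervening store cancels the reservation, the model repeats the whole sequence and the returned old value is the one loaded on the successful attempt.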

Fetch and AND 

The "Fetch and AND" primitive atomically ANDs a value into a word in 
storage. 

In this example it is assumed that the address of the word to be 
ANDed is in GPR 3, the value to AND into it is in GPR 4, and the old 
value is returned in GPR 5. 

loop:   lwarx   r5,0,r3         #load and reserve
        and     r0,r4,r5        #AND word
        stwcx.  r0,0,r3         #store new value if still reserved
        bne-    loop            #loop if lost reservation

Note:

1. The sequence given above can be changed to perform another Boolean
operation atomically on a word in storage, simply by changing the
and instruction to the desired Boolean instruction (or, xor, etc.).

Test and Set 

This version of the "Test and Set" primitive atomically loads a word from 
storage, sets the word in storage to a nonzero value if the value loaded is 
zero, and sets the EQ bit of CR Field 0 to indicate whether the value 
loaded is zero. 

In this example it is assumed that the address of the word to be tested 
is in GPR 3, the new value (nonzero) is in GPR 4, and the old value is 
returned in GPR 5. 

loop:   lwarx   r5,0,r3         #load and reserve
        cmpwi   r5,0            #done if word
        bne-    $+12            #  not equal to 0
        stwcx.  r4,0,r3         #try to store non-0
        bne-    loop            #loop if lost reservation

Compare and Swap 

The "Compare and Swap" primitive atomically compares a value in a 
register with a word in storage, if they are equal stores the value from a 
second register into the word in storage, if they are unequal loads the 
word from storage into the first register, and sets the EQ bit of CR Field 0 
to indicate the result of the comparison. 

In this example it is assumed that the address of the word to be tested 
is in GPR 3, the comparand is in GPR 4 and the old value is returned 
there, and the new value is in GPR 5. 

loop:   lwarx   r6,0,r3         #load and reserve
        cmpw    r4,r6           #1st 2 operands equal?
        bne-    exit            #skip if not
        stwcx.  r5,0,r3         #store new value if still reserved
        bne-    loop            #loop if lost reservation
exit:   mr      r4,r6           #return value from storage

Notes: 

1. The semantics given for "Compare and Swap" above are based on 
those of the IBM System/370 Compare and Swap instruction. Other 
architectures may define a Compare and Swap instruction differently. 

2. Compare and Swap is shown primarily for pedagogical reasons. It is
useful on machines that lack the better synchronization facilities
provided by lwarx and stwcx. A major weakness of a System/370-style
Compare and Swap instruction is that, although the instruction itself
is atomic, it checks only that the old and current values of the word
being tested are equal, with the result that programs that use such a
Compare and Swap to control a shared resource can err if the word
has been modified and the old value subsequently restored. The
sequence shown above has the same weakness.

3. In some applications the second bne- instruction and/or the mr 
instruction can be omitted. The bne- is needed only if the application 
requires that if the EQ bit of CR Field 0 on exit indicates "not equal" 
then (r4) and (r6) are in fact not equal. The mr is needed only if the 
application requires that if the comparands are not equal then the 
word from storage is loaded into the register with which it was com- 
pared (rather than into a third register). If either or both of these 
instructions is omitted, the resulting Compare and Swap does not obey 
System/370 semantics. 
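
The weakness described in Note 2 can be made concrete with a toy model. The Python below is a sketch under stated assumptions: the Word class and its single reservation flag are hypothetical, and other_store stands for a store by another processor. It contrasts a comparison that checks only values with a conditional sequence that spans the whole read and update.

```python
class Word:
    """Toy model of one storage word with a single reservation flag."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def lwarx(self):
        self.reserved = True
        return self.value

    def stwcx(self, value):
        ok = self.reserved
        if ok:
            self.value = value
        self.reserved = False
        return ok

    def other_store(self, value):
        """Store by another processor; cancels the reservation."""
        self.value = value
        self.reserved = False


def value_compare_and_swap(word, expected, new):
    """Compare and Swap that checks only equality of old and current values."""
    if word.value != expected:
        return False
    word.value = new
    return True


def run_value_based():
    word = Word(1)
    seen = word.value                 # program reads the word
    word.other_store(2)               # another processor modifies it ...
    word.other_store(1)               # ... and then restores the old value
    return value_compare_and_swap(word, seen, 9)


def run_reservation_based():
    word = Word(1)
    word.lwarx()                      # read and reserve
    word.other_store(2)
    word.other_store(1)
    return word.stwcx(9)              # fails: the reservation was lost
```

In the model, the value-based swap succeeds even though the word was modified and restored in the meantime, while the Store Conditional fails and forces a retry.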

E.1.2 Lock Acquisition and Release 

This example gives an algorithm for locking that demonstrates the use of 
synchronization with an atomic read/modify/write operation. A shared 
storage location, the address of which is an argument of the "lock" and 
"unlock" procedures, given by GPR 3, is used as a lock, to control access 
to some shared resource such as a shared data structure. The lock is open 
when its value is 0 and closed (locked) when its value is 1. Before access- 
ing the shared resource, a processor sets the lock by changing its value 
from 0 to 1. To do this, the "lock" procedure calls test_and_set, which
executes the code sequence shown in the "Test and Set" example of
Section E.1.1, thereby atomically loading the old value of the lock, writing to
the lock the new value (1) given in GPR 4, returning the old value in GPR
5 (not used below), and setting the EQ bit of CR Field 0 according to
whether the value loaded is 0. The "lock" procedure repeats the
test_and_set until it succeeds in changing the value of the lock from 0
to 1.

The processor must not access the shared resource until it sets the
lock. After the bne- that checks for the success of test_and_set, the
processor executes an isync instruction (see Book II, "Instruction
Synchronize XL-form," on page 346). This delays all subsequent
instructions until all previous instructions have completed to the extent
required by context synchronization (see Book III, Section 1.7.1,
"Context Synchronization," on page 371). sync could be used, but
performance would be degraded unnecessarily because sync waits for all prior
storage accesses to complete with respect to all other processors, which is
not necessary here.

lock:   li      r4,1            #obtain lock:
loop:   bl      test_and_set    #  test-and-set
        bne-    loop            #  retry til old = 0
# Delay subsequent inst'ns til prior inst'ns finish
        isync
        blr                     #return

The "unlock" procedure writes a 0 to the lock location. Most applica- 
tions that use locking require, for correctness, that if the access to the 
shared resource included write operations, the processor must execute a 
sync instruction to make its modifications visible to all processors before 
releasing the lock. In this example, the "unlock" procedure begins with a 
sync for this purpose. 

unlock: sync                    #delay til prior stores finish
        li      r1,0            #store 0 to lock location
        stw     r1,0(r3)
        blr                     #return

E.1.3 List Insertion 

This example shows how the lwarx and stwcx. instructions can be used
to implement simple insertion into a singly linked list. (Complicated list 
insertion, in which multiple values must be changed atomically, or in 
which the correct order of insertion depends on the contents of the ele- 
ments, cannot be implemented in the manner shown below and requires a 
more complicated strategy such as using locks.) 

The "next element pointer" from the list element after which the new 
element is to be inserted, here called the "parent element," is stored into 
the new element, so that the new element points to the next element in 
the list: this store is performed unconditionally. Then the address of the 
new element is conditionally stored into the parent element, thereby add- 
ing the new element to the list. 

In this example it is assumed that the address of the parent element is 
in GPR 3, the address of the new element is in GPR 4, and the next ele- 
ment pointer is at offset 0 from the start of the element. It is also assumed 
that the next element pointer of each list element is in a "reservation 
granule" separate from that of the next element pointer of all other list 
elements: see Book II, Section 1.8.2, "Atomic Update Primitives," on 
page 336.

loop:   lwarx   r2,0,r3         #get next pointer
        stw     r2,0(r4)        #store in new element
        sync                    #let store settle (can omit if not MP)
        stwcx.  r4,0,r3         #add new element to list
        bne-    loop            #loop if stwcx. failed

In the preceding example, if two list elements have next element
pointers in the same reservation granule then, in a multiprocessor, "livelock"
can occur. (Livelock is a state in which processors interact in a way such
that no processor makes progress.)

If it is not possible to allocate list elements such that each element's 
next element pointer is in a different reservation granule, then livelock 
can be avoided by using the following, more complicated, sequence. 





        lwz     r2,0(r3)        #get next pointer
loop1:  mr      r5,r2           #keep a copy
        stw     r2,0(r4)        #store in new element
        sync                    #let store settle
loop2:  lwarx   r2,0,r3         #get it again
        cmpw    r2,r5           #loop if changed (someone
        bne-    loop1           #  else progressed)
        stwcx.  r4,0,r3         #add new element to list
        bne-    loop2           #loop if stwcx. failed
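
The simple insertion sequence (the first one in this section) can be modeled in a few lines of Python. The sketch is hypothetical throughout: ListStorage, the element "addresses" 1 through 5, and the intruder hook are invented for illustration, and the model assumes, as the text does, that the new element's pointer lies in a different reservation granule from the parent's.

```python
class ListStorage:
    """Toy model: next-element pointers keyed by element address, one reservation."""

    def __init__(self, next_pointers):
        self.next = dict(next_pointers)
        self.reserved = None

    def lwarx(self, addr):
        self.reserved = addr
        return self.next[addr]

    def stw(self, addr, value):
        # Own ordinary store; assumed to lie in a different reservation granule.
        self.next[addr] = value

    def stwcx(self, addr, value):
        ok = self.reserved == addr
        if ok:
            self.next[addr] = value
        self.reserved = None
        return ok

    def other_store(self, addr, value):
        """Store by another processor; cancels a matching reservation."""
        self.next[addr] = value
        if self.reserved == addr:
            self.reserved = None


def insert_after(mem, parent, new, intruder=None):
    """Insert element `new` after `parent`; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        nxt = mem.lwarx(parent)          # lwarx  r2,0,r3
        mem.stw(new, nxt)                # stw    r2,0(r4)
        if intruder is not None:         # another processor gets in first, once
            intruder, run = None, intruder
            run(mem)
        if mem.stwcx(parent, new):       # stwcx. r4,0,r3
            return attempts              # otherwise bne- loop


def walk(mem, head):
    """Collect element addresses from head until the null (0) pointer."""
    out = []
    while head != 0:
        out.append(head)
        head = mem.next[head]
    return out
```

If another processor inserts after the same parent between the Load And Reserve and the Store Conditional, the model's first attempt fails and the retry links the new element in front of the rival's element, so neither insertion is lost.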



E.1.4 Notes 

1. In general, lwarx and stwcx. instructions should be paired, with the
same effective address used for both. The exception is an isolated
stwcx. instruction that is used to clear any existing reservation on the
processor, for which there is no paired lwarx and for which any
(scratch) effective address can be used.

2. It is acceptable to execute a lwarx instruction for which no stwcx.
instruction is executed. For example, this occurs in the "Test and Set"
sequence shown above if the value loaded is not zero.

3. To increase the likelihood that forward progress is made, it is
important that looping on lwarx/stwcx. pairs be minimized. For example, in
the sequence shown above for "Test and Set," this is achieved by
testing the old value before attempting the store: were the order reversed,
more stwcx. instructions might be executed, and reservations might
more often be lost between the lwarx and the stwcx.

4. The manner in which lwarx and stwcx. are communicated to other
processors and mechanisms, and between levels of the storage subsystem
within a given processor (see Book II, PowerPC Virtual Environment
Architecture), is implementation-dependent. In some
implementations, performance may be improved by minimizing looping
on a lwarx instruction that fails to return a desired value. For
example, in the "Test and Set" example shown above, if the programmer
wishes to stay in the loop until the word loaded is zero, he could
change the "bne- $+12" to "bne- loop". However, in some implementations
better performance may be obtained by using an ordinary
Load instruction to do the initial checking of the value, as follows.



loop:   lwz     r5,0(r3)        #load the word
        cmpwi   r5,0            #loop back if word
        bne-    loop            #  not equal to 0
        lwarx   r5,0,r3         #try again, reserving
        cmpwi   r5,0            #  (likely to succeed)
        bne-    loop
        stwcx.  r4,0,r3         #try to store non-0
        bne-    loop            #loop if lost reservation



5. In a multiprocessor, livelock is possible if a loop containing a lwarx/
stwcx. pair also contains an ordinary Store instruction for which any
byte of the affected storage area is in the reservation granule: see Book
II, Section 1.8.2, "Atomic Update Primitives," on page 336. For example,
the first code sequence shown in Section E.1.3 can cause livelock if
two list elements have next element pointers in the same reservation
granule.

E.2 Multiple-Precision Shifts 

This section gives examples of how multiple-precision shifts can be pro- 
grammed. 

A multiple-precision shift is initially defined to be a shift of an N-dou- 
bleword quantity (64-bit mode) or an N-word quantity (32-bit mode), 
where N>1. The quantity to be shifted is contained in N registers (in the 
low-order 32 bits in 32-bit mode). The shift amount is specified either by 
an immediate value in the instruction, or by bits 57:63 (64-bit mode) or
58:63 (32-bit mode) of a register. 

The examples shown below distinguish between the cases N=2 and 
N>2. If N=2, the shift amount may be in the range 0 through 127 (64-bit 
mode) or 0 through 63 (32-bit mode), which are the maximum ranges 
supported by the Shift instructions used. However, if N>2, the shift 
amount must be in the range 0 through 63 (64-bit mode) or 0 through 31
(32-bit mode), in order for the examples to yield the desired result. The
specific instance shown for N>2 is N=3: extending those code sequences
to larger N is straightforward, as is reducing them to the case N=2 when 
the more stringent restriction on shift amount is met. For shifts with 
immediate shift amounts only the case N=3 is shown, because the more 
stringent restriction on shift amount is always met. 

In the examples it is assumed that GPRs 2 and 3 (and 4) contain the 
quantity to be shifted, and that the result is to be placed into the same 
registers, except for the immediate left shifts in 64-bit mode for which the 
result is placed into GPRs 3, 4, and 5. In all cases, for both input and 
result, the lowest-numbered register contains the highest-order part of the 
data and highest-numbered register contains the lowest-order part. In 32- 
bit mode, the high-order 32 bits of these registers are assumed not to be 
part of the quantity to be shifted or of the result. For non-immediate 
shifts, the shift amount is assumed to be in bits 57:63 (64-bit mode) or 
58:63 (32-bit mode) of GPR 6. For immediate shifts, the shift amount is 
assumed to be greater than 0. GPRs 0 and 31 are used as scratch regis- 
ters. 

For N>2, the number of instructions required is 2N-1 (immediate 
shifts) or 3N-1 (non-immediate shifts). 



Multiple-precision shifts in 64-bit mode                    Multiple-precision shifts in 32-bit mode

Shift Left Immediate, N = 3 (shift amnt < 64)               Shift Left Immediate, N = 3 (shift amnt < 32)
  rldicr  r5,r4,sh,63-sh                                      rlwinm  r2,r2,sh,0,31-sh
  rldimi  r4,r3,0,sh                                          rlwimi  r2,r3,sh,32-sh,31
  rldicl  r4,r4,sh,0                                          rlwinm  r3,r3,sh,0,31-sh
  rldimi  r3,r2,0,sh                                          rlwimi  r3,r4,sh,32-sh,31
  rldicl  r3,r3,sh,0                                          rlwinm  r4,r4,sh,0,31-sh

Shift Left, N = 2 (shift amnt < 128)                        Shift Left, N = 2 (shift amnt < 64)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  sld     r2,r2,r6                                            slw     r2,r2,r6
  srd     r0,r3,r31                                           srw     r0,r3,r31
  or      r2,r2,r0                                            or      r2,r2,r0
  addi    r31,r6,-64                                          addi    r31,r6,-32
  sld     r0,r3,r31                                           slw     r0,r3,r31
  or      r2,r2,r0                                            or      r2,r2,r0
  sld     r3,r3,r6                                            slw     r3,r3,r6

Shift Left, N = 3 (shift amnt < 64)                         Shift Left, N = 3 (shift amnt < 32)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  sld     r2,r2,r6                                            slw     r2,r2,r6
  srd     r0,r3,r31                                           srw     r0,r3,r31
  or      r2,r2,r0                                            or      r2,r2,r0
  sld     r3,r3,r6                                            slw     r3,r3,r6
  srd     r0,r4,r31                                           srw     r0,r4,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  sld     r4,r4,r6                                            slw     r4,r4,r6

Shift Right Immediate, N = 3 (shift amnt < 64)              Shift Right Immediate, N = 3 (shift amnt < 32)
  rldimi  r4,r3,0,64-sh                                       rlwinm  r4,r4,32-sh,sh,31
  rldicl  r4,r4,64-sh,0                                       rlwimi  r4,r3,32-sh,0,sh-1
  rldimi  r3,r2,0,64-sh                                       rlwinm  r3,r3,32-sh,sh,31
  rldicl  r3,r3,64-sh,0                                       rlwimi  r3,r2,32-sh,0,sh-1
  rldicl  r2,r2,64-sh,sh                                      rlwinm  r2,r2,32-sh,sh,31

Shift Right, N = 2 (shift amnt < 128)                       Shift Right, N = 2 (shift amnt < 64)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  srd     r3,r3,r6                                            srw     r3,r3,r6
  sld     r0,r2,r31                                           slw     r0,r2,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  addi    r31,r6,-64                                          addi    r31,r6,-32
  srd     r0,r2,r31                                           srw     r0,r2,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  srd     r2,r2,r6                                            srw     r2,r2,r6

Shift Right, N = 3 (shift amnt < 64)                        Shift Right, N = 3 (shift amnt < 32)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  srd     r4,r4,r6                                            srw     r4,r4,r6
  sld     r0,r3,r31                                           slw     r0,r3,r31
  or      r4,r4,r0                                            or      r4,r4,r0
  srd     r3,r3,r6                                            srw     r3,r3,r6
  sld     r0,r2,r31                                           slw     r0,r2,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  srd     r2,r2,r6                                            srw     r2,r2,r6

Shift Right Algebraic Immediate, N = 3 (shift amnt < 64)    Shift Right Algebraic Immediate, N = 3 (shift amnt < 32)
  rldimi  r4,r3,0,64-sh                                       rlwinm  r4,r4,32-sh,sh,31
  rldicl  r4,r4,64-sh,0                                       rlwimi  r4,r3,32-sh,0,sh-1
  rldimi  r3,r2,0,64-sh                                       rlwinm  r3,r3,32-sh,sh,31
  rldicl  r3,r3,64-sh,0                                       rlwimi  r3,r2,32-sh,0,sh-1
  sradi   r2,r2,sh                                            srawi   r2,r2,sh

Shift Right Algebraic, N = 2 (shift amnt < 128)             Shift Right Algebraic, N = 2 (shift amnt < 64)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  srd     r3,r3,r6                                            srw     r3,r3,r6
  sld     r0,r2,r31                                           slw     r0,r2,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  addic.  r31,r6,-64                                          addic.  r31,r6,-32
  srad    r0,r2,r31                                           sraw    r0,r2,r31
  ble     $+8                                                 ble     $+8
  ori     r3,r0,0                                             ori     r3,r0,0
  srad    r2,r2,r6                                            sraw    r2,r2,r6

Shift Right Algebraic, N = 3 (shift amnt < 64)              Shift Right Algebraic, N = 3 (shift amnt < 32)
  subfic  r31,r6,64                                           subfic  r31,r6,32
  srd     r4,r4,r6                                            srw     r4,r4,r6
  sld     r0,r3,r31                                           slw     r0,r3,r31
  or      r4,r4,r0                                            or      r4,r4,r0
  srd     r3,r3,r6                                            srw     r3,r3,r6
  sld     r0,r2,r31                                           slw     r0,r2,r31
  or      r3,r3,r0                                            or      r3,r3,r0
  srad    r2,r2,r6                                            sraw    r2,r2,r6
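
To see why the N = 2 sequences tolerate shift amounts up to twice the register width, the 64-bit "Shift Left, N = 2" sequence can be traced with a small Python model. The sketch assumes that sld and srd use the low-order seven bits of the shift register and produce zero when that amount is 64 or more; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sld(rs, rb):
    """Shift left doubleword: amount is the low 7 bits of rb; 64..127 yields 0."""
    n = rb & 0x7F
    return (rs << n) & MASK64 if n < 64 else 0


def srd(rs, rb):
    """Shift right doubleword: amount is the low 7 bits of rb; 64..127 yields 0."""
    n = rb & 0x7F
    return rs >> n if n < 64 else 0


def shift_left_n2(r2, r3, r6):
    """Shift the 128-bit quantity r2:r3 left by r6 (0 through 127)."""
    r31 = (64 - r6) & MASK64      # subfic r31,r6,64
    r2 = sld(r2, r6)              # sld    r2,r2,r6
    r0 = srd(r3, r31)             # srd    r0,r3,r31
    r2 |= r0                      # or     r2,r2,r0
    r31 = (r6 - 64) & MASK64      # addi   r31,r6,-64
    r0 = sld(r3, r31)             # sld    r0,r3,r31
    r2 |= r0                      # or     r2,r2,r0
    r3 = sld(r3, r6)              # sld    r3,r3,r6
    return r2, r3
```

For any given shift amount at most one of the two values ORed into r2 is nonzero: the srd contributes bits when the amount is 1 through 64, and the second sld contributes when it is 64 through 127 (at exactly 64 both yield the same value).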



E.3 Floating-Point Conversions 

This section gives examples of how the Floating-Point Conversion 
instructions can be used to perform various conversions. 

Warning: Some of the examples use the fsel instruction. Care must be 
taken in using fsel if IEEE compatibility is required or if the values being 
tested can be NaNs or infinities: see Section E.4.4, "Notes," on page 266. 

E.3.1 Conversion from Floating-Point 
Number to Floating-Point Integer 

In a 64-bit implementation 

The full convert to floating-point integer function can be implemented 
with the sequence shown below, assuming the floating-point value to be 
converted is in FPR 1, and the result is returned in FPR 3. 



        mtfsb0  23              #clear VXCVI
        fctid[z] f3,f1          #convert to fx int
        fcfid   f3,f3           #convert back again
        mcrfs   7,5             #VXCVI to CR
        bf      31,$+8          #skip if VXCVI was 0
        fmr     f3,f1           #input was fp int
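
The effect of this sequence on a double-precision value can be approximated in Python. The sketch is a model under assumptions, not a statement of the architecture: it treats VXCVI as being set for NaNs and for values outside the signed 64-bit range, it assumes round-to-nearest for the non-truncating form, and the function name is hypothetical.

```python
import math


def to_fp_integer(x: float, truncate: bool = True) -> float:
    """Model the fctid[z] / fcfid sequence for a double-precision value x."""
    # Assumed VXCVI condition: NaN, or value outside the signed 64-bit range.
    if math.isnan(x) or not (-2.0**63 <= x < 2.0**63):
        return x                                  # fmr f3,f1: keep the input
    # fctidz truncates; fctid is modeled here with round-to-nearest-even.
    n = math.trunc(x) if truncate else round(x)
    return float(n)                               # fcfid
```

Values too large to be converted are already floating-point integers in double format, which is why returning the input unchanged is appropriate for them.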

In a 32-bit implementation

This example will be provided in a subsequent edition. 

E.3.2 Conversion from Floating-Point
Number to Signed Fixed-Point Integer
Doubleword

This example applies to 64-bit implementations only. 

The full convert to signed fixed-point integer doubleword function can
be implemented with the sequence shown below, assuming the
floating-point value to be converted is in FPR 1, the result is returned in GPR 3,
and a doubleword at displacement "disp" from the address in GPR 1 can
be used as scratch space.

        fctid[z] f2,f1          #convert to dword int
        stfd    f2,disp(r1)     #store float
        ld      r3,disp(r1)     #load dword

E.3.3 Conversion from Floating-Point
Number to Unsigned Fixed-Point Integer
Doubleword

This example applies to 64-bit implementations only. 

The full convert to unsigned fixed-point integer doubleword function
can be implemented with the sequence shown below, assuming the
floating-point value to be converted is in FPR 1, the value 0 is in FPR 0, the
value 2**64-2048 is in FPR 3, the value 2**63 is in FPR 4 and GPR 4, the
result is returned in GPR 3, and a doubleword at displacement "disp"
from the address in GPR 1 can be used as scratch space.



        fsel    f2,f1,f1,f0     #use 0 if < 0
        fsub    f5,f3,f1        #use max if > max
        fsel    f2,f5,f2,f3
        fsub    f5,f2,f4        #subtract 2**63
        fcmpu   cr2,f2,f4       #use diff if > 2**63
        fsel    f2,f5,f5,f2
        fctid[z] f2,f2          #convert to fx int
        stfd    f2,disp(r1)     #store float
        ld      r3,disp(r1)     #load dword
        blt     cr2,$+8         #add 2**63 if input
        add     r3,r3,r4        #  was > 2**63
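
The clamping and bias steps of this sequence can be followed in a Python model. The sketch below is illustrative only: it assumes fsel selects its third operand when the second is greater than or equal to zero and its fourth otherwise, models only the truncating form fctidz with saturation, and uses hypothetical function names. Per the warning at the start of Section E.3, NaN inputs are not considered.

```python
import math

MASK64 = (1 << 64) - 1
TWO63 = 2.0 ** 63
MAXVAL = 2.0 ** 64 - 2048.0        # largest double-precision value below 2**64


def fsel(fra, frc, frb):
    """Assumed fsel behavior: frc if fra >= 0.0, else frb."""
    return frc if fra >= 0.0 else frb


def fctidz(x):
    """Truncating convert to a signed 64-bit integer, saturating at the limits."""
    if math.isnan(x) or x <= -TWO63:
        return -(1 << 63)
    if x >= TWO63:
        return (1 << 63) - 1
    return math.trunc(x)


def to_unsigned_dword(f1):
    f0, f3, f4, r4 = 0.0, MAXVAL, TWO63, 1 << 63
    f2 = fsel(f1, f1, f0)          # use 0 if < 0
    f5 = f3 - f1                   # use max if > max
    f2 = fsel(f5, f2, f3)
    f5 = f2 - f4                   # subtract 2**63
    less = f2 < f4                 # fcmpu cr2,f2,f4
    f2 = fsel(f5, f5, f2)          # use diff if >= 2**63
    r3 = fctidz(f2) & MASK64       # fctidz, stfd, ld
    if not less:                   # blt cr2,$+8 skips the add
        r3 = (r3 + r4) & MASK64    # add r3,r3,r4
    return r3
```

Subtracting 2**63 before the conversion keeps the operand within the signed range of the convert instruction; the integer add afterward restores the bias.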

E.3.4 Conversion from Floating-Point
Number to Signed Fixed-Point Integer Word

The full convert to signed fixed-point integer word function can be imple- 
mented with the sequence shown below, assuming the floating-point 
value to be converted is in FPR 1, the result is returned in GPR 3, and a 
doubleword at displacement "disp" from the address in GPR 1 can be 
used as scratch space. The last instruction is needed only if a 64-bit result 
is required, and applies to 64-bit implementations only. 

        fctiw[z] f2,f1          #convert to fx int
        stfd    f2,disp(r1)     #store float
        lwz     r3,disp+4(r1)   #load word and zero
        extsw   r3,r3           #(for 64-bit result)

E.3.5 Conversion from Floating-Point
Number to Unsigned Fixed-Point Integer 
Word 

In a 64-bit implementation 

The full convert to unsigned fixed-point integer word function can be
implemented with the sequence shown below, assuming the floating-point
value to be converted is in FPR 1, the value 0 is in FPR 0, the value 2**32-1
is in FPR 3, the result is returned in GPR 3, and a doubleword at
displacement "disp" from the address in GPR 1 can be used as scratch
space.



fsel     f2,f1,f1,f0     #use 0 if < 0
fsub     f4,f3,f1        #use max if > max
fsel     f2,f4,f2,f3
fctid[z] f2,f2           #convert to fx int
stfd     f2,disp(r1)     #store float
lwz      r3,disp+4(r1)   #load word and zero



In a 32-bit implementation 

The full convert to unsigned fixed-point integer word function can be
implemented with the sequence shown below, assuming the floating-point
value to be converted is in FPR 1, the value 0 is in FPR 0, the value 2^32 - 1 is
in FPR 3, the value 2^31 is in FPR 4, the result is returned in GPR 3, and a
doubleword at displacement "disp" from the address in GPR 1 can be
used as scratch space.

fsel     f2,f1,f1,f0     #use 0 if < 0
fsub     f5,f3,f1        #use max if > max
fsel     f2,f5,f2,f3
fsub     f5,f2,f4        #subtract 2**31
fcmpu    cr2,f2,f4       #use diff if >= 2**31
fsel     f2,f5,f5,f2
fctiw[z] f2,f2           #convert to fx int
stfd     f2,disp(r1)     #store float
lwz      r3,disp+4(r1)   #load word
blt      cr2,$+8         #add 2**31 if input
xoris    r3,r3,0x8000    # was >= 2**31
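
The 32-bit sequence above can be traced step by step with a small Python model. This is a sketch under stated assumptions, not a definitive implementation: `fsel` and `float_to_uint32` are hypothetical helper names, the convert step models the truncating `fctiwz` form, and the constants are those the text places in FPR 0, FPR 3, and FPR 4.

```python
import math

def fsel(fra, frc, frb):
    """Floating Select: frc if fra >= 0.0, otherwise frb (a NaN selects frb)."""
    return frc if fra >= 0.0 else frb

def float_to_uint32(f1):
    f0, f3, f4 = 0.0, 2.0**32 - 1, 2.0**31   # assumed contents of FPR 0, 3, 4
    f2 = fsel(f1, f1, f0)        # fsel  f2,f1,f1,f0   use 0 if < 0
    f5 = f3 - f1                 # fsub  f5,f3,f1
    f2 = fsel(f5, f2, f3)        # fsel  f2,f5,f2,f3   use max if > max
    f5 = f2 - f4                 # fsub  f5,f2,f4      subtract 2**31
    below = f2 < f4              # fcmpu cr2,f2,f4
    f2 = fsel(f5, f5, f2)        # fsel  f2,f5,f5,f2   use diff if >= 2**31
    r3 = math.trunc(f2) & 0xFFFFFFFF   # fctiwz, stfd, lwz
    if not below:                # blt cr2,$+8 skips the fix-up when below
        r3 ^= 0x80000000         # xoris r3,r3,0x8000  add 2**31 back
    return r3
```

Subtracting 2^31 first keeps the value inside the signed word range that the convert instruction can represent; flipping the high-order bit afterwards restores it.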



E.3.6 Conversion from Signed Fixed-Point
Integer Doubleword to Floating-Point
Number

This example applies to 64-bit implementations only.

The full convert from signed fixed-point integer doubleword function,
using the rounding mode specified by FPSCR[RN], can be implemented
with the sequence shown below, assuming the fixed-point value to be
converted is in GPR 3, the result is returned in FPR 1, and a doubleword
at displacement "disp" from the address in GPR 1 can be used as scratch
space.

std      r3,disp(r1)     #store dword
lfd      f1,disp(r1)     #load float
fcfid    f1,f1           #convert to fp int



E.3.7 Conversion from Unsigned Fixed-Point
Integer Doubleword to Floating-Point
Number

This example applies to 64-bit implementations only. 

The full convert from unsigned fixed-point integer doubleword function,
using the rounding mode specified by FPSCR[RN], can be implemented
with the sequence shown below, assuming the fixed-point value
to be converted is in GPR 3, the value 2^32 is in FPR 4, the result is
returned in FPR 1, and two doublewords at displacement "disp" from
the address in GPR 1 can be used as scratch space.

rldicl   r2,r3,32,32     #isolate high half
rldicl   r0,r3,0,32      #isolate low half
std      r2,disp(r1)     #store dword both
std      r0,disp+8(r1)
lfd      f2,disp(r1)     #load float both
lfd      f1,disp+8(r1)
fcfid    f2,f2           #convert each half to
fcfid    f1,f1           # fp int (no round)
fmadd    f1,f4,f2,f1     #(2**32)*high + low
                         # (only add can round)

An alternative, shorter, sequence can be used if rounding according to
FPSCR[RN] is desired and FPSCR[RN] specifies Round toward +Infinity or
Round toward -Infinity, or if it is acceptable for a rounded answer to be
either of the two representable floating-point integers nearest to the given
fixed-point integer. In this case the full convert from unsigned fixed-point
integer doubleword function can be implemented with the sequence
shown below, assuming the value 2^64 is in FPR 2.



std      r3,disp(r1)     #store dword
lfd      f1,disp(r1)     #load float
fcfid    f1,f1           #convert to fp int
fadd     f4,f1,f2        #add 2**64
fsel     f1,f1,f1,f4     # if r3 < 0
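
The difference between the two sequences in this section is how many times the value is rounded. The Python sketch below illustrates this under the assumption of Round to Nearest mode (Python's default float behavior); `full_sequence` and `short_sequence` are hypothetical names, and the full sequence is modelled with exact integer arithmetic followed by a single rounding, which is what the multiply-add achieves.

```python
def full_sequence(n):
    """Halves convert exactly; only the final multiply-add rounds."""
    hi, lo = n >> 32, n & 0xFFFFFFFF
    return float(hi * 2**32 + lo)             # one rounding, like fmadd

def short_sequence(n):
    """Signed convert, then add 2**64 if the signed view was negative."""
    signed = n - 2**64 if n >= 2**63 else n   # how fcfid interprets the bits
    f1 = float(signed)                        # first rounding (fcfid)
    f4 = f1 + 2.0**64                         # second rounding (fadd)
    return f1 if f1 >= 0.0 else f4            # fsel f1,f1,f1,f4

# A value chosen so that the two roundings of the short sequence
# land on a different neighbour than the single rounding does.
n = 2**64 - 2**54 - 1025
single = full_sequence(n)
double = short_sequence(n)
```

For this input the two results are adjacent representable values, 2048 apart, which matches the text's condition that the short sequence is acceptable only when either of the two nearest representable integers will do.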



E.3.8 Conversion from Signed Fixed-Point
Integer Word to Floating-Point Number

In a 64-bit implementation 

The full convert from signed fixed-point integer word function can be 
implemented with the sequence shown below, assuming the fixed-point 
value to be converted is in GPR 3, the result is returned in FPR 1, and a 
doubleword at displacement "disp" from the address in GPR 1 can be 
used as scratch space. (Rounding cannot occur.) 

extsw    r3,r3           #extend sign
std      r3,disp(r1)     #store dword
lfd      f1,disp(r1)     #load float
fcfid    f1,f1           #convert to fp int

In a 32-bit implementation 

This example will be provided in a subsequent edition. 

E.3.9 Conversion from Unsigned Fixed-Point
Integer Word to Floating-Point Number

In a 64-bit implementation 

The full convert from unsigned fixed-point integer word function can be 
implemented with the sequence shown below, assuming the fixed-point 
value to be converted is in GPR 3, the result is returned in FPR 1, and a 
doubleword at displacement "disp" from the address in GPR 1 can be 
used as scratch space. (Rounding cannot occur.) 

rldicl   r0,r3,0,32      #zero-extend
std      r0,disp(r1)     #store dword
lfd      f1,disp(r1)     #load float
fcfid    f1,f1           #convert to fp int

In a 32-bit implementation 

This example will be provided in a subsequent edition. 



E.4 Floating-Point Selection 

This section gives examples of how the Floating Select instruction can be 
used to implement floating-point minimum and maximum functions, and 
certain simple forms of if-then-else constructions, without branching. 

The examples show program fragments in an imaginary, C-like,
high-level programming language, and the corresponding program fragment
using fsel and other PowerPC instructions. In the examples, a, b, x, y, and
z are floating-point variables, which are assumed to be in FPRs fa, fb, fx,
fy, and fz. FPR fs is assumed to be available for scratch space.

Additional examples can be found in Section E.3, "Floating-Point 
Conversions," on page 259. 

Warning: Care must be taken in using fsel if IEEE compatibility is 
required or if the values being tested can be NaNs or infinities: see Sec- 
tion E.4.4, "Notes," on page 266. 

E.4.1 Comparison to Zero

High-level language:          PowerPC:               Notes

if a ≥ 0.0 then x <- y        fsel fx,fa,fy,fz       (1)
else x <- z

if a > 0.0 then x <- y        fneg fs,fa             (1,2)
else x <- z                   fsel fx,fs,fz,fy

if a = 0.0 then x <- y        fsel fx,fa,fy,fz       (1)
else x <- z                   fneg fs,fa
                              fsel fx,fs,fx,fz
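
The three fragments above can be checked with a small Python model. It is a sketch only: `fsel` here is a hypothetical function that models the Floating Select rule (select the first source if the tested operand is greater than or equal to zero, otherwise the second, with a NaN treated as failing the test), and the wrapper names are invented for illustration.

```python
def fsel(fra, frc, frb):
    """Floating Select: frc if fra >= 0.0, otherwise frb (a NaN selects frb)."""
    return frc if fra >= 0.0 else frb

def select_ge_zero(a, y, z):      # if a >= 0.0 then x <- y else x <- z
    return fsel(a, y, z)          # fsel fx,fa,fy,fz

def select_gt_zero(a, y, z):      # if a > 0.0 then x <- y else x <- z
    fs = -a                       # fneg fs,fa
    return fsel(fs, z, y)         # fsel fx,fs,fz,fy

def select_eq_zero(a, y, z):      # if a = 0.0 then x <- y else x <- z
    fx = fsel(a, y, z)            # fsel fx,fa,fy,fz
    fs = -a                       # fneg fs,fa
    return fsel(fs, fx, z)        # fsel fx,fs,fx,fz

# With a NaN input the "a > 0.0" fragment selects y even though the
# comparison is false, which is the situation Note 2 warns about.
nan_case = select_gt_zero(float("nan"), "y", "z")
```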



E.4.2 Minimum and Maximum

High-level language:          PowerPC:               Notes

x <- min(a,b)                 fsub fs,fa,fb          (3,4,5)
                              fsel fx,fs,fb,fa

x <- max(a,b)                 fsub fs,fa,fb          (3,4,5)
                              fsel fx,fs,fa,fb



E.4.3 Simple if-then-else Constructions

High-level language:          PowerPC:               Notes

if a ≥ b then x <- y          fsub fs,fa,fb          (4,5)
else x <- z                   fsel fx,fs,fy,fz

if a > b then x <- y          fsub fs,fb,fa          (3,4,5)
else x <- z                   fsel fx,fs,fz,fy

if a = b then x <- y          fsub fs,fa,fb          (4,5)
else x <- z                   fsel fx,fs,fy,fz
                              fneg fs,fs
                              fsel fx,fs,fx,fz

E.4.4 Notes

The following Notes apply to the preceding examples and to the
corresponding cases using the other three arithmetic relations (<, ≤, and ≠).
They should also be considered when any other use of fsel is
contemplated.

In these Notes, the "optimized program" is the PowerPC program
shown, and the "unoptimized program" (not shown) is the corresponding
PowerPC program that uses fcmpu and Branch Conditional instructions
instead of fsel.

1. The unoptimized program affects the VXSNAN bit of the FPSCR, and
therefore may cause the system error handler to be invoked if the
corresponding exception is enabled, while the optimized program does
not affect this bit. This property of the optimized program is
incompatible with the IEEE standard.

2. The optimized program gives the incorrect result if a is a NaN.

3. The optimized program gives the incorrect result if a and/or b is a
NaN (except that it may give the correct result in some cases for the
minimum and maximum functions, depending on how those functions
are defined to operate on NaNs).

4. The optimized program gives the incorrect result if a and b are
infinities of the same sign. (Here it is assumed that Invalid Operation
Exceptions are disabled, in which case the result of the subtraction is a
NaN. The analysis is more complicated if Invalid Operation
Exceptions are enabled, because in that case the target register of the
subtraction is unchanged.)

5. The optimized program affects the OX, UX, XX, and VXISI bits of
the FPSCR, and therefore may cause the system error handler to be
invoked if the corresponding exceptions are enabled, while the
unoptimized program does not affect these bits. This property of the
optimized program is incompatible with the IEEE standard.
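
Notes 3 and 4 can be seen in a short Python model of the subtract-then-select idiom. This is an illustrative sketch only: `fsel`, `fmin`, and `select_ge` are hypothetical names, and Python's float subtraction stands in for fsub with Invalid Operation Exceptions disabled (infinity minus infinity yields a NaN).

```python
import math

def fsel(fra, frc, frb):
    """Floating Select: frc if fra >= 0.0, otherwise frb (a NaN selects frb)."""
    return frc if fra >= 0.0 else frb

def fmin(a, b):
    # fsub fs,fa,fb ; fsel fx,fs,fb,fa
    return fsel(a - b, b, a)

def select_ge(a, b, y, z):
    # fsub fs,fa,fb ; fsel fx,fs,fy,fz   (if a >= b then y else z)
    return fsel(a - b, y, z)

inf = float("inf")
nan = float("nan")

# Note 3: with a NaN operand the difference is a NaN, so fsel always
# returns its last operand; the result depends on operand order.
min_nan_first = fmin(nan, 1.0)
min_nan_second = fmin(1.0, nan)

# Note 4: inf - inf is a NaN, so "a >= b" is reported false even
# though the two infinities compare equal.
same_sign_infinities = select_ge(inf, inf, "y", "z")
```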

Changed POWER Mnemonics

The following table lists the POWER instruction mnemonics that have
been changed in the PowerPC Architecture, sorted by POWER
mnemonic.

To determine the PowerPC mnemonic for one of these POWER
mnemonics, find the POWER mnemonic in the second column of the table:
the remainder of the line gives the PowerPC mnemonic and the page on
which the instruction is described, as well as the instruction names. The
Book number is shown in the "Page" column for instructions that are not
defined in Book I.

POWER mnemonics that have not changed are not listed. POWER
instruction names that are the same in PowerPC are not repeated: i.e., for
these, the last column of the table is blank.



           POWER                                                 PowerPC
Page       Mnemonic    Instruction                               Mnemonic      Instruction

85         a[o][.]     Add                                       addc[o][.]    Add Carrying
86         ae[o][.]    Add Extended                              adde[o][.]
84         ai          Add Immediate                             addic         Add Immediate Carrying
84         ai.         Add Immediate and Record                  addic.        Add Immediate Carrying and Record
87         ame[o][.]   Add To Minus One Extended                 addme[o][.]   Add to Minus One Extended
106        andil.      AND Immediate Lower                       andi.         AND Immediate
106        andiu.      AND Immediate Upper                       andis.        AND Immediate Shifted
88         aze[o][.]   Add To Zero Extended                      addze[o][.]   Add to Zero Extended
40         bcc[l]      Branch Conditional to Count Register      bcctr[l]
39         bcr[l]      Branch Conditional to Link Register       bclr[l]
82         cal         Compute Address Lower                     addi          Add Immediate
82         cau         Compute Address Upper                     addis         Add Immediate Shifted
83         cax[o][.]   Compute Address                           add[o][.]     Add
114        cntlz[.]    Count Leading Zeros                       cntlzw[.]     Count Leading Zeros Word
347 (II)   dclz        Data Cache Line Set to Zero               dcbz          Data Cache Block set to Zero
80         dcs         Data Cache Synchronize                    sync          Synchronize
113        exts[.]     Extend Sign                               extsh[.]      Extend Sign Halfword
179        fa[.]       Floating Add                              fadd[.]
182        fd[.]       Floating Divide                           fdiv[.]
181        fm[.]       Floating Multiply                         fmul[.]
183        fma[.]      Floating Multiply-Add                     fmadd[.]
184        fms[.]      Floating Multiply-Subtract                fmsub[.]
185        fnma[.]     Floating Negative Multiply-Add            fnmadd[.]
186        fnms[.]     Floating Negative Multiply-Subtract       fnmsub[.]
180        fs[.]       Floating Subtract                         fsub[.]
346 (II)   ics         Instruction Cache Synchronize             isync         Instruction Synchronize
55         l           Load                                      lwz           Load Word and Zero
69         lbrx        Load Byte-Reverse Indexed                 lwbrx         Load Word Byte-Reverse Indexed
71         lm          Load Multiple                             lmw           Load Multiple Word
73         lsi         Load String Immediate                     lswi          Load String Word Immediate
74         lsx         Load String Indexed                       lswx          Load String Word Indexed
56         lu          Load with Update                          lwzu          Load Word and Zero with Update
57         lux         Load with Update Indexed                  lwzux         Load Word and Zero with Update Indexed
56         lx          Load Indexed                              lwzx          Load Word and Zero Indexed
441 (III)  mtsri       Move To Segment Register Indirect         mtsrin
90         muli        Multiply Immediate                        mulli         Multiply Low Immediate
71         muls[o][.]  Multiply Short                            mullw[o][.]   Multiply Low Word
107        oril        OR Immediate Lower                        ori           OR Immediate
107        oriu        OR Immediate Upper                        oris          OR Immediate Shifted
122        rlimi[.]    Rotate Left Immediate Then Mask Insert    rlwimi[.]     Rotate Left Word Immediate then Mask Insert
119        rlinm[.]    Rotate Left Immediate Then AND With Mask  rlwinm[.]     Rotate Left Word Immediate then AND with Mask
121        rlnm[.]     Rotate Left Then AND With Mask            rlwnm[.]      Rotate Left Word then AND with Mask
86         sf[o][.]    Subtract From                             subfc[o][.]   Subtract From Carrying
87         sfe[o][.]   Subtract From Extended                    subfe[o][.]
85         sfi         Subtract From Immediate                   subfic        Subtract From Immediate Carrying
88         sfme[o][.]  Subtract From Minus One Extended          subfme[o][.]
89         sfze[o][.]  Subtract From Zero Extended               subfze[o][.]
124        sl[.]       Shift Left                                slw[.]        Shift Left Word
125        sr[.]       Shift Right                               srw[.]        Shift Right Word
128        sra[.]      Shift Right Algebraic                     sraw[.]       Shift Right Algebraic Word
126        srai[.]     Shift Right Algebraic Immediate           srawi[.]      Shift Right Algebraic Word Immediate
64         st          Store                                     stw           Store Word
70         stbrx       Store Byte-Reverse Indexed                stwbrx        Store Word Byte-Reverse Indexed
72         stm         Store Multiple                            stmw          Store Multiple Word
75         stsi        Store String Immediate                    stswi         Store String Word Immediate
76         stsx        Store String Indexed                      stswx         Store String Word Indexed
65         stu         Store with Update                         stwu          Store Word with Update
66         stux        Store with Update Indexed                 stwux         Store Word with Update Indexed
65         stx         Store Indexed                             stwx          Store Word Indexed
41         svca        Supervisor Call                           sc            System Call (see also Book III, page 378)
105        t           Trap                                      tw            Trap Word
103        ti          Trap Immediate                            twi           Trap Word Immediate
444 (III)  tlbi        TLB Invalidate Entry                      tlbie
108        xoril       XOR Immediate Lower                       xori          XOR Immediate
108        xoriu       XOR Immediate Upper                       xoris         XOR Immediate Shifted

Incompatibilities with the POWER
Architecture

This appendix identifies the known incompatibilities that must be
managed in the migration from the POWER Architecture to the PowerPC
Architecture. Some of the incompatibilities can, at least in principle, be
detected by the processor, which could trap and let software simulate the
POWER operation. Others cannot be detected by the processor even in
principle. In general, the incompatibilities identified here are those that
affect a POWER application program: incompatibilities for instructions
that can be used only by POWER system programs are not necessarily
discussed.

G.1 New Instructions, Formerly
Privileged Instructions

Instructions new to PowerPC typically use opcode values (including
extended opcode) that are illegal in POWER. A few instructions that are
privileged in POWER (e.g., dclz, called dcbz in PowerPC) have been
made nonprivileged in PowerPC. Any POWER program that executes
one of these now-valid or now-nonprivileged instructions, expecting to
cause the system illegal instruction error handler or the system privileged
instruction error handler to be invoked, will not execute correctly on
PowerPC.

G.2 Newly Privileged Instructions

The following instructions are nonprivileged in POWER but privileged in 
PowerPC. 

mfmsr 

mfsr 

G.3 Reserved Bits in Instructions 

These are shown with /'s in the instruction layouts. In POWER such bits
are ignored by the processor. In PowerPC they must be 0 or the
instruction form is invalid.

In several cases, the PowerPC Architecture assumes that such bits in
POWER instructions are indeed 0. The cases include the following:

■ cmpi, cmp, cmpli, and cmpl assume that bit 10 in the POWER
instructions is 0.

■ mtspr and mfspr assume that bits 16:20 in the POWER instructions 
are 0. 
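
A check for such bits can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the architecture text: `field` is a hypothetical helper, bits are numbered with bit 0 as the most significant bit of the 32-bit instruction, and the sample word 0x7C6802A6 is taken to be a Move From SPR of the Link Register into GPR 3.

```python
def field(insn, first, last):
    """Return bits first..last of a 32-bit instruction word, where
    bit 0 is the most significant bit and bit 31 the least."""
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)

def power_spr_upper_bits_are_zero(insn):
    """True if bits 16:20, which PowerPC assumes are 0 in POWER
    mtspr/mfspr instructions, are in fact 0."""
    return field(insn, 16, 20) == 0

sample = 0x7C6802A6           # assumed encoding: mfspr with SPR 8 (LR), RT = 3
primary_opcode = field(sample, 0, 5)
spr_low_half = field(sample, 11, 15)
```

A translator or assembler could apply the same kind of test to bit 10 of the compare instructions.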

G.4 Reserved Bits in Registers 

POWER defines these bits to be 0 on read, and either 0 or 1 on write. In 
PowerPC it is implementation-dependent, for each bit, whether the bit is: 

■ 0 on read and ignored on write; or 

■ copied from source to target on both read and write. 

G.5 Alignment Check 

The POWER MSR AL bit (bit 24) is no longer supported: the bit is 
reserved in PowerPC. The low-order bits of the EA are always used. 
(Notice that the value 0 — the normal value for a reserved SPR bit — 
means "ignore the low-order EA bits" in POWER, and the value 1 means 
"use the low-order EA bits.") However, MSR bit 24 will not be assigned 
new meaning in the near future (see Book III, Section 2.2.3, "Machine 
State Register," on page 374), and software is permitted to write the 
value 1 to the bit. 

G.6 Condition Register

The following instructions specify a field in the CR explicitly (via the BF
field) and also, in POWER, use bit 31 as the Record bit. In PowerPC, if
bit 31=1 for these instructions the instruction form is invalid. In POWER,
if Rc=1 the instructions execute normally except as follows:

cmp      CR0 is undefined if Rc=1 and BF≠0
cmpl     CR0 is undefined if Rc=1 and BF≠0
mcrxr    CR0 is undefined if Rc=1 and BF≠0
fcmpu    CR1 is undefined if Rc=1
fcmpo    CR1 is undefined if Rc=1
mcrfs    CR1 is undefined if Rc=1 and BF≠1



G.7 Inappropriate Use of LK and Rc
Bits

For the instructions listed below, if bit 31 (LK or Rc bit in POWER) is set
to 1, POWER executes the instruction normally with the exception of
setting the Link Register (if LK=1) or Condition Register Field 0 or 1 (if
Rc=1) to an undefined value. In PowerPC such instruction forms are
invalid.

PowerPC instructions that are invalid form if bit 31=1 (LK bit in
POWER):

sc (svc in POWER)
the Condition Register Logical instructions
mcrf
isync (ics in POWER)

PowerPC instructions that are invalid form if bit 31=1 (Rc bit in
POWER):

fixed-point X-form Load and Store instructions
fixed-point X-form Compare instructions
the X-form Trap instruction
mtspr, mfspr, mtcrf, mcrxr, mfcr
floating-point X-form Load and Store instructions
floating-point Compare instructions
mcrfs
dcbz (dclz in POWER)

G.8 BO Field 

POWER shows certain bits in the BO field — used by Branch Conditional 
instructions — as "x." Although the POWER Architecture does not say 
how these bits are to be interpreted, they are in fact ignored by the pro- 
cessor. 

PowerPC shows these bits as either "z" or "y." The "z" bits are 
ignored, as in POWER. However, the "y" bit need not be ignored, but 
rather can be used to give a hint about whether the branch is likely to be 
taken. If a POWER program has the "wrong" value for this bit, the pro- 
gram will run correctly but performance may suffer. 

G.9 Branch Conditional to Count 
Register 

For the case in which the Count Register is decremented and tested (i.e., 
the case in which BO2=0), POWER specifies only that the branch target 
address is undefined, with the implication that the Count Register, and 
the Link Register if LK=1, are updated in the normal way. PowerPC con- 
siders this instruction form invalid. 

G.10 System Call 

There are several respects in which PowerPC is incompatible with 
POWER for System Call instructions — which in POWER are called 
Supervisor Call instructions. 

■ POWER provides a version of the Supervisor Call instruction (bit
30=0) that allows instruction fetching to continue at any one of 128
locations. It is used for "fast SVCs." PowerPC provides no such
version: if bit 30 of the instruction is 0 the instruction form is invalid.

■ POWER provides a version of the Supervisor Call instruction (bits
30:31=0b11) that resumes instruction fetching at one location and sets
the Link Register to the address of the next instruction. PowerPC
provides no such version: if bit 31 of the instruction is 1 the instruction
form is invalid.

■ For POWER, information from the MSR is saved in the Count
Register. For PowerPC, this information is saved in SRR1.

■ POWER permits bits 16:29 of the instruction to be nonzero, while in
PowerPC such an instruction form is invalid.

■ POWER saves the low-order 16 bits of the instruction in the Count
Register. PowerPC does not save them.

■ The settings of MSR bits by the associated interrupt differ between
POWER and PowerPC: see POWER Processor Architecture and
Book III, Figure 80, "MSR setting due to interrupt," on page 458.

G.11 Fixed-Point Exception Register 
(XER) 

Bits 16:23 of the XER are reserved in PowerPC, while in POWER they 
are defined and contain the comparison byte for the Iscbx instruction 
(which PowerPC lacks). 

G.12 Update Forms of Storage Access

PowerPC requires that RA not be equal to either RT (fixed-point Load
only) or 0. If the restriction is violated the instruction form is invalid.
POWER permits these cases and simply avoids saving the EA.

G.13 Multiple Register Loads

PowerPC requires that RA, and RB if present in the instruction format,
not be in the range of registers to be loaded, while POWER permits this
and does not alter RA or RB in this case. (The PowerPC restriction
applies even if RA=0, although there is no obvious benefit to the
restriction in this case since RA is not used to compute the effective address if
RA=0.) If the PowerPC restriction is violated, either the system illegal
instruction error handler is invoked or the results are boundedly undefined.
The instructions affected are:

lmw (lm in POWER)
lswi (lsi in POWER)
lswx (lsx in POWER)

For example, an lmw instruction that loads all 32 registers is valid in
POWER but is an invalid form in PowerPC.
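
The restriction can be expressed as a simple range test. The Python sketch below covers only the Load Multiple Word case and is an illustration rather than a definitive validity check: `lmw_is_valid_form` is a hypothetical name, and it assumes that the instruction loads GPRs RT through 31.

```python
def lmw_is_valid_form(rt, ra):
    """PowerPC form check for Load Multiple Word: RA must not be in
    the range of registers to be loaded (assumed to be RT..31)."""
    return ra not in range(rt, 32)

# Loading r29..r31 with r1 as the base register satisfies the rule.
typical = lmw_is_valid_form(29, 1)

# Loading all 32 registers (RT = 0) can never satisfy it, whichever
# register is named as RA, including RA = 0.
load_all = [lmw_is_valid_form(0, ra) for ra in range(32)]
```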

G.14 Alignment for Load/Store 
Multiple 

PowerPC requires the EA to be word-aligned, and yields an Alignment
interrupt or boundedly undefined results if it is not. POWER specifies
that an Alignment interrupt occurs (if MSR[AL]=1).

G.15 Move Assist Instructions

There are several respects in which PowerPC is incompatible with
POWER for Move Assist instructions.

■ In PowerPC an lswx instruction with zero length leaves the content of
RT undefined, while in POWER the corresponding instruction (lsx)
does not alter RT in this case.

■ In PowerPC an lswx instruction with zero length may alter the
Reference bit, and an stswx instruction with zero length may alter the
Reference and Change bits, while in POWER the corresponding
instructions (lsx and stsx) do not alter the Reference and Change bits
in this case.

G.16 Synchronization 

The sync instruction (called dcs in POWER) and the isync instruction 
(called ics in POWER) cause much more pervasive synchronization in 
PowerPC than in POWER. 

G.17 Move To/From SPR

There are several respects in which PowerPC is incompatible with
POWER for Move To/From Special Purpose Register instructions.

■ The SPR field is ten bits long in PowerPC, but only five in POWER
(see also Section G.3, "Reserved Bits in Instructions," on page 272).

■ mfspr can be used to read the Decrementer in problem state in
POWER, but only in privileged state in PowerPC.

■ If the SPR value specified in the instruction is not one of the defined
values, POWER behaves as follows.

  - If the instruction is executed in problem state and SPR[0]=1, a
    Privileged Instruction type Program interrupt occurs. No architected
    registers are altered except those set by the interrupt.

  - Otherwise (the instruction is executed in privileged state or
    SPR[0]=0), no architected registers are altered.

In this same case, PowerPC behaves as follows.

  - If the instruction is executed in problem state and SPR[0]=1, either an
    Illegal Instruction type Program interrupt or a Privileged Instruction
    type Program interrupt occurs. No architected registers are
    altered except those set by the interrupt.

  - Otherwise (the instruction is executed in privileged state or SPR[0]=0),
    either an Illegal Instruction type Program interrupt occurs (in which
    case no architected registers are altered except those set by the
    interrupt) or the results are boundedly undefined.

G.18 Effects of Exceptions on FPSCR
Bits FR and FI

For the following cases, POWER does not say how FR and FI are set, 
while PowerPC preserves them for Invalid Operation Exception caused 
by a Compare instruction, sets FI to 1 and FR to an undefined value for 
disabled Overflow Exception, and clears them otherwise. 

■ Invalid Operation Exception (enabled or disabled) 

■ Zero Divide Exception (enabled or disabled) 

■ Disabled Overflow Exception 

G.19 Floating-Point Store Instructions

POWER uses FPSCR[UE] to help determine whether denormalization
should be done, while PowerPC does not. Use of FPSCR[UE] is in fact
incorrect: if FPSCR[UE]=1 and a denormalized single-precision number is
copied from one storage location to another by means of lfs followed by
stfs, the two "copies" may not be the same.

G.20 Move From FPSCR

POWER defines the high-order 32 bits of the result of mffs to be
0xFFFF_FFFF, while PowerPC says they are undefined.

G.21 Zeroing Bytes in the Data Cache 

The dclz instruction of POWER and the dcbz instruction of PowerPC 
have the same opcode. However, the functions differ in the following 
respects: 

■ dclz clears a line while dcbz clears a block. 

■ dclz saves the EA in RA (if RA≠0) while dcbz does not.

■ dclz is privileged while dcbz is not. 

G.22 Floating-Point Load/Store to 
Direct-Store Segment 

In POWER a floating-point Load or Store instruction to a direct-store 
segment causes a Data Storage interrupt, while in PowerPC the instruc- 
tion either executes correctly or causes an Alignment interrupt. 

G.23 Segment Register Instructions

The definitions of the four Segment Register instructions (mtsr, mtsrin,
mfsr, and mfsrin) differ in two respects between POWER and PowerPC.
Instructions similar to mtsrin and mfsrin are called mtsri and mfsri in
POWER.

privilege: mfsr and mfsri are problem state instructions in POWER,
           while mfsr and mfsrin are privileged in PowerPC.

function:  the "indirect" instructions (mtsri and mfsri) in POWER
           use an RA register in computing the Segment Register
           number, and the computed EA is stored into RA (if RA≠0 and
           RA≠RT), while in PowerPC mtsrin and mfsrin have no RA
           field and the EA is not stored.

mtsr, mtsrin (mtsri), and mfsr have the same opcodes in PowerPC as in
POWER. mfsri (POWER) and mfsrin (PowerPC) have different opcodes.

G.24 TLB Entry Invalidation 

The tlbi instruction of POWER and the tlbie instruction of PowerPC 
have the same opcode. However, the functions differ in the following 
respects: 

■ tlbi computes the EA as (RA|0) + (RB), while tlbie lacks an RA field
and computes the EA as (RB).

■ tlbi saves the EA in RA (if RA≠0), while tlbie lacks an RA field and
does not save the EA.

G.25 Floating-Point Interrupts 

Both architectures use MSR bit 20 to control the generation of interrupts 
for floating-point enabled exceptions. However, in PowerPC this bit is 
part of a two-bit value that controls the occurrence, precision, and recov- 
erability of the interrupt, while in POWER this bit is used independently 
to control the occurrence of the interrupt (in POWER all floating-point 
interrupts are precise). 

G.26 Timing Facilities 
G.26.1 Real-Time Clock 

The POWER Real-Time Clock is not supported in PowerPC. Instead, 
PowerPC provides a Time Base. Both the RTC and the TB are 64-bit Spe- 
cial Purpose Registers, but they differ in the following respects: 






■ The RTC counts seconds and nanoseconds, while the TB counts 
"ticks." The ticking rate of the TB is implementation-dependent. 

■ The RTC increments discontinuously: 1 is added to RTCU when the
value in RTCL passes 999_999_999. The TB increments continuously:
1 is added to TBU when the value in TBL passes 0xFFFF_FFFF.

■ The RTC is written and read by the mtspr and mfspr instructions,
using SPR numbers that denote the RTCU and RTCL. The TB is
written by the mtspr instruction (using new SPR numbers), and read by the
new mftb instruction.

■ The SPR numbers that denote POWER's RTCL and RTCU are invalid
in PowerPC.

■ The RTC is guaranteed to increment at least once in the time required 
to execute ten Add Immediate instructions. No analogous guarantee is 
made for the TB. 

■ Not all bits of RTCL need be implemented, while all bits of the TB 
must be implemented. 
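
The difference in how the two counters carry from the lower half into the upper half can be sketched in Python. Both functions are hypothetical models for illustration: they assume 32-bit upper and lower halves and ignore unimplemented RTCL bits and the implementation-dependent tick rate.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000

def rtc_advance(rtcu, rtcl, nanoseconds=1):
    """Real-Time Clock: RTCL counts nanoseconds and wraps after
    999_999_999, carrying 1 into RTCU (seconds)."""
    total = rtcl + nanoseconds
    seconds = (rtcu + total // NANOSECONDS_PER_SECOND) & 0xFFFFFFFF
    return seconds, total % NANOSECONDS_PER_SECOND

def tb_advance(tbu, tbl, ticks=1):
    """Time Base: a plain 64-bit binary counter; TBL wraps after
    0xFFFF_FFFF, carrying 1 into TBU."""
    total = (((tbu << 32) | tbl) + ticks) & 0xFFFFFFFFFFFFFFFF
    return total >> 32, total & 0xFFFFFFFF

rtc_carry = rtc_advance(0, 999_999_999)
tb_no_carry = tb_advance(0, 999_999_999)
tb_carry = tb_advance(0, 0xFFFF_FFFF)
```

Code that treats the two halves of the Time Base as seconds and nanoseconds, as it could for the RTC, would therefore compute wrong intervals.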

G.26.2 Decrementer 

The PowerPC Decrementer differs from the POWER Decrementer in the 
following respects: 

■ The PowerPC DEC decrements at the same rate that the TB incre- 
ments, while the POWER Decrementer decrements every nanosecond 
(which is the same rate that the RTC increments). 

■ Not all bits of the POWER DEC need be implemented, while all bits of 
the PowerPC DEC must be implemented. 

■ The interrupt caused by the DEC has its own interrupt vector location 
in PowerPC, but is considered an External interrupt in POWER. 



G.27 Deleted Instructions

The following instructions are part of the POWER Architecture but have
been dropped from the PowerPC Architecture.

abs         Absolute
clcs        Cache Line Compute Size
clf         Cache Line Flush
cli(*)      Cache Line Invalidate
dclst       Data Cache Line Store
div         Divide
divs        Divide Short
doz         Difference Or Zero
dozi        Difference Or Zero Immediate
lscbx       Load String And Compare Byte Indexed
maskg       Mask Generate
maskir      Mask Insert From Register
mfsri       Move From Segment Register Indirect
mul         Multiply
nabs        Negative Absolute
rac(*)      Real Address Compute
rfsvc(*)    Return From SVC
rlmi        Rotate Left Then Mask Insert
rrib        Rotate Right And Insert Bit
sle         Shift Left Extended
sleq        Shift Left Extended With MQ
sliq        Shift Left Immediate With MQ
slliq       Shift Left Long Immediate With MQ
sllq        Shift Left Long With MQ
slq         Shift Left With MQ
sraiq       Shift Right Algebraic Immediate With MQ
sraq        Shift Right Algebraic With MQ
sre         Shift Right Extended
srea        Shift Right Extended Algebraic
sreq        Shift Right Extended With MQ
sriq        Shift Right Immediate With MQ
srliq       Shift Right Long Immediate With MQ
srlq        Shift Right Long With MQ
srq         Shift Right With MQ

(*) This instruction is privileged. 

Note: Many of these instructions use the MQ register. The MQ is not 
defined in the PowerPC Architecture. 



Assembler Note

It might be helpful to current software writers for the Assembler to flag
the discontinued POWER instructions.



G.28 Discontinued Opcodes 

The opcodes listed below are defined in the POWER Architecture but 
have been dropped from the PowerPC Architecture. The list contains the 
POWER mnemonic (MNEM), the primary opcode (PRI), and the 
extended opcode (XOP) if appropriate. The corresponding instructions 
are reserved in PowerPC. 



MNEM        PRI    XOP

abs         31     360
clcs        31     531
clf         31     118
cli(*)      31     502
dclst       31     630
div         31     331
divs        31     363
doz         31     264
dozi        09
lscbx       31     277
maskg       31     29
maskir      31     541
mfsri       31     627
mul         31     107
nabs        31     488
rac(*)      31     818
rfsvc(*)    19     82
rlmi        22
rrib        31     537
sle         31     153
sleq        31     217
sliq        31     184
slliq       31     248
sllq        31     216
slq         31     152
sraiq       31     952
sraq        31     920
sre         31     665
srea        31     921
sreq        31     729
sriq        31     696
srliq       31     760
srlq        31     728
srq         31     664



(*) This instruction is privileged. 



G.29 POWER2 Compatibility 

The POWER2 instruction set is a superset of the POWER instruction set.
Some of the instructions added for POWER2 are included in the
PowerPC Architecture. Those that have been renamed in the PowerPC
Architecture are listed in this section, as are the new POWER2
instructions that are not included in the PowerPC Architecture.
Other incompatibilities are also listed.

G.29.1 Cross-Reference for Changed POWER2 Mnemonics

The following table lists the new POWER2 instruction mnemonics that 
have been changed in the PowerPC User Instruction Set Architecture, 
sorted by POWER2 mnemonic. 
To determine the PowerPC mnemonic for one of these POWER2 mnemonics,
find the POWER2 mnemonic in the second column of the table:
the remainder of the line gives the PowerPC mnemonic and the page on
which the instruction is described, as well as the instruction names.

POWER2 mnemonics that have not changed are not listed. 



Page  POWER2                                                            PowerPC
      Mnemonic  Instruction                                             Mnemonic   Instruction
189   fcir[.]   Floating Convert Double to Integer with Round           fctiw[.]   Floating Convert To Integer Word
190   fcirz[.]  Floating Convert Double to Integer with Round to Zero   fctiwz[.]  Floating Convert To Integer Word with round toward Zero



G.29.2 Floating-Point Conversion to Integer

The fcir and fcirz instructions of POWER2 have the same opcodes as do 
the fctiw and fctiwz instructions, respectively, of PowerPC. However, the 
functions differ in the following respects. 

■ fcir and fcirz set the high-order 32 bits of the target FPR to
0xFFFF_FFFF, while fctiw and fctiwz set them to an undefined value.

■ Except for enabled Invalid Operation Exceptions, fcir and fcirz set the 
FPRF field of the FPSCR based on the result, while fctiw and fctiwz set 
it to an undefined value. 

■ fcir and fcirz do not affect the VXSNAN bit of the FPSCR, while fctiw 
and fctiwz do. 

■ fcir and fcirz set the XX bit of the FPSCR to 1 for certain cases of
"Large Operands" (i.e., operands that are too large to be represented
as a 32-bit signed fixed-point integer), while fctiw and fctiwz do not
alter it for any case of "Large Operand." (The IEEE standard requires
not altering it for "Large Operands.")

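The first difference in the list above (the high-order 32 bits of the target FPR) can be made concrete with a small sketch. It models only the 64-bit register image left by the POWER2 fcirz instruction for operands that fit in a 32-bit signed integer, and it assumes the converted integer occupies the low-order 32 bits. The function names are hypothetical, and the FPSCR effects and "Large Operand" cases described above are deliberately not modelled.

```python
def fcirz_image(value):
    """64-bit FPR image after POWER2 fcirz, for in-range operands only.

    Assumption: the 32-bit result sits in the low-order word; the text
    states only that the high-order 32 bits are 0xFFFF_FFFF.
    """
    result = int(value)  # int() truncates toward zero
    if not -2**31 <= result < 2**31:
        raise ValueError("large operand: not modelled in this sketch")
    return (0xFFFF_FFFF << 32) | (result & 0xFFFF_FFFF)


def low_word(image):
    """The only part of the image that is also defined for fctiw/fctiwz."""
    return image & 0xFFFF_FFFF


print(hex(fcirz_image(2.9)))  # prints: 0xffffffff00000002
```

Because fctiw and fctiwz leave the high-order word undefined, code intended to run on both architectures should rely only on the value returned by something like `low_word`.
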
G.29.3 Storage Ordering 

POWER2 uses MSR bit 28 to control storage ordering. This bit is 
reserved in PowerPC, and no corresponding control is provided. 
G.29.4 Floating-Point Interrupts

Both architectures use MSR bits 20 and 23 to control the generation of
interrupts for floating-point enabled exceptions. However, in PowerPC
these bits comprise a two-bit value which controls the occurrence,
precision, and recoverability of the interrupt, while in POWER2 these
bits are used independently to control the occurrence (bit 20) and the
precision (bit 23) of the interrupt. Moreover, in PowerPC all
floating-point interrupts are considered Program interrupts, while in
POWER2 imprecise floating-point interrupts have their own interrupt
vector location.
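
The two readings of MSR bits 20 and 23 can be sketched as follows. The sketch assumes a 32-bit MSR image numbered with bit 0 as the most significant bit, so bit n has mask `1 << (31 - n)`. The function names are hypothetical, and no meaning is assigned here to the individual values of the PowerPC two-bit field.

```python
def msr_bit(msr, n):
    """Bit n of a 32-bit MSR image, with bit 0 as the most significant bit."""
    return (msr >> (31 - n)) & 1


def powerpc_fp_control(msr):
    """PowerPC view: bits 20 and 23 form a single two-bit value (0..3)."""
    return (msr_bit(msr, 20) << 1) | msr_bit(msr, 23)


def power2_fp_control(msr):
    """POWER2 view: two independent controls (occurrence, precision)."""
    return msr_bit(msr, 20), msr_bit(msr, 23)


msr = 0x0000_0900  # bits 20 and 23 both set
print(powerpc_fp_control(msr), power2_fp_control(msr))  # prints: 3 (1, 1)
```

The same register image therefore decodes to one of four combined settings under PowerPC, but to a pair of unrelated switches under POWER2.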

G.29.5 Trace Interrupts

The interrupt vector location differs between the two architectures. Also, 
the trace facility is optional in PowerPC but required in POWER2. 

G.29.6 Deleted Instructions

The following instructions are new in the POWER2 Architecture but 
have been dropped from the PowerPC Architecture. 



lfq       Load Floating-Point Quad
lfqu      Load Floating-Point Quad with Update
lfqux     Load Floating-Point Quad with Update Indexed
lfqx      Load Floating-Point Quad Indexed
stfq      Store Floating-Point Quad
stfqu     Store Floating-Point Quad with Update
stfqux    Store Floating-Point Quad with Update Indexed
stfqx     Store Floating-Point Quad Indexed



G.29.7 Discontinued Opcodes 

The opcodes listed below are new in the POWER2 Architecture but have 
been dropped from the PowerPC Architecture. The list contains the 
POWER2 mnemonic (MNEM), the primary opcode (PRI), and the 
extended opcode (XOP) if appropriate. The corresponding instructions 
are reserved in PowerPC. 
MNEM      PRI   XOP
lfq       56    -
lfqu      57    -
lfqux     31    823
lfqx      31    791
stfq      60    -
stfqu     61    -
stfqux    31    951
stfqx     31    919
The following instructions in the PowerPC Architecture are new: they are
not in the POWER Architecture.

They are listed in three groups, according to whether they exist in all 
PowerPC implementations, only in 64-bit implementations, or only in 
32-bit implementations. 

The following instructions are optional: eciwx, ecowx, fres, frsqrte,
fsel, fsqrt[s], slbia, slbie, stfiwx, tlbia, tlbsync.

H.1 New Instructions for All Implementations



dcbf      Data Cache Block Flush
dcbi      Data Cache Block Invalidate
dcbst     Data Cache Block Store
dcbt      Data Cache Block Touch
dcbtst    Data Cache Block Touch for Store
divw      Divide Word
divwu     Divide Word Unsigned
eciwx     External Control In Word Indexed
ecowx     External Control Out Word Indexed
eieio     Enforce In-order Execution of I/O
extsb     Extend Sign Byte
fadds     Floating Add Single
fctiw     Floating Convert To Integer Word
fctiwz    Floating Convert To Integer Word with round toward Zero
fdivs     Floating Divide Single
fmadds    Floating Multiply-Add Single
fmsubs    Floating Multiply-Subtract Single
fmuls     Floating Multiply Single
fnmadds   Floating Negative Multiply-Add Single
fnmsubs   Floating Negative Multiply-Subtract Single
fres      Floating Reciprocal Estimate Single
frsqrte   Floating Reciprocal Square Root Estimate
fsel      Floating Select
fsqrt[s]  Floating Square Root [Single]
fsubs     Floating Subtract Single
icbi      Instruction Cache Block Invalidate
lwarx     Load Word And Reserve Indexed
mftb      Move From Time Base
mulhw     Multiply High Word
mulhwu    Multiply High Word Unsigned
stfiwx    Store Floating-Point as Integer Word Indexed
stwcx.    Store Word Conditional Indexed
subf      Subtract From
tlbia     TLB Invalidate All
tlbsync   TLB Synchronize
H.2 New Instructions for 64-Bit Implementations Only

cntlzd    Count Leading Zeros Doubleword
divd      Divide Doubleword
divdu     Divide Doubleword Unsigned
extsw     Extend Sign Word
fcfid     Floating Convert From Integer Doubleword
fctid     Floating Convert To Integer Doubleword
fctidz    Floating Convert To Integer Doubleword with round toward Zero
lwa       Load Word Algebraic
lwaux     Load Word Algebraic with Update Indexed
lwax      Load Word Algebraic Indexed
ld        Load Doubleword
ldarx     Load Doubleword And Reserve Indexed
ldu       Load Doubleword with Update
ldux      Load Doubleword with Update Indexed
ldx       Load Doubleword Indexed
mulhd     Multiply High Doubleword
mulhdu    Multiply High Doubleword Unsigned
mulld     Multiply Low Doubleword
rldcl     Rotate Left Doubleword then Clear Left
rldcr     Rotate Left Doubleword then Clear Right
rldic     Rotate Left Doubleword Immediate then Clear
rldicl    Rotate Left Doubleword Immediate then Clear Left
rldicr    Rotate Left Doubleword Immediate then Clear Right
rldimi    Rotate Left Doubleword Immediate then Mask Insert
slbia     SLB Invalidate All
slbie     SLB Invalidate Entry
sld       Shift Left Doubleword
srad      Shift Right Algebraic Doubleword
sradi     Shift Right Algebraic Doubleword Immediate
srd       Shift Right Doubleword
std       Store Doubleword
stdcx.    Store Doubleword Conditional Indexed
stdu      Store Doubleword with Update
stdux     Store Doubleword with Update Indexed
stdx      Store Doubleword Indexed
td        Trap Doubleword
tdi       Trap Doubleword Immediate



H.3 New Instructions for 32-Bit Implementations Only

mfsrin Move From Segment Register Indirect 
Illegal Instructions




With the exception of the instruction consisting entirely of binary 0s,
the instructions in this class are available for future extensions of
the PowerPC Architecture: that is, some future version of the PowerPC
Architecture may define any of these instructions to perform new
functions.

The following primary opcodes are illegal: 

1, 4, 5, 6, 56, 57, 60, 61 

In addition, the following primary opcodes are illegal for 32-bit
implementations (they are defined only for 64-bit implementations):

2, 30, 58, 62 

The following primary opcodes have unused extended opcodes.
Extended opcodes for instructions that are defined only for 64-bit
implementations are illegal in 32-bit implementations, and extended
opcodes for instructions that are defined only for 32-bit
implementations are illegal in 64-bit implementations. All unused
extended opcodes are illegal.

19, 30¹, 31, 59, 62¹, 63

An instruction consisting entirely of binary 0s is illegal, and is
guaranteed to be illegal in all future versions of this architecture.



¹ Applies only for 64-bit implementations (illegal primary opcode for
32-bit implementations)
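
The primary-opcode rules above lend themselves to a short check. The following is a minimal sketch that looks only at the primary opcode and at the all-zero word; it assumes the primary opcode occupies the six high-order bits of a 32-bit instruction word, and it does not attempt the extended-opcode checks, which would need the full opcode tables. The names are hypothetical.

```python
ILLEGAL_PRIMARY = {1, 4, 5, 6, 56, 57, 60, 61}
ONLY_64BIT_PRIMARY = {2, 30, 58, 62}  # illegal on 32-bit implementations


def has_illegal_primary(word, is_64bit_impl):
    """True if the word is illegal judging by its primary opcode alone.

    Assumption: the primary opcode is the six high-order bits of the word.
    Unused extended opcodes are NOT detected by this sketch.
    """
    if word == 0:
        return True  # all binary 0s: illegal in every version
    primary = (word >> 26) & 0x3F
    if primary in ILLEGAL_PRIMARY:
        return True
    return primary in ONLY_64BIT_PRIMARY and not is_64bit_impl


print(has_illegal_primary(58 << 26, is_64bit_impl=False))  # prints: True
print(has_illegal_primary(58 << 26, is_64bit_impl=True))   # prints: False
```

A real decoder would follow this with a per-primary lookup of the defined extended opcodes for primary opcodes 19, 30, 31, 59, 62, and 63.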



Reserved Instructions 




The instructions in this class are allocated to specific purposes that are 
outside the scope of the PowerPC User Instruction Set Architecture, 
PowerPC Virtual Environment Architecture, and PowerPC Operating 
Environment Architecture. 

The following types of instruction are included in this class: 

1. The instruction having primary opcode 0, except the instruction
consisting entirely of binary 0s (which is an illegal instruction: see
Section 1.8.2, "Illegal Instruction Class," on page 24).

2. Instructions for the POWER Architecture that have not been included
in the PowerPC Architecture. These are listed in Section G.28,
"Discontinued Opcodes," on page 282 and Section G.29.7, "Discontinued
Opcodes," on page 285.

3. Implementation-specific instructions used to conform to the PowerPC
Architecture specifications.

4. Any other instructions contained in the Book IV, PowerPC
Implementation Features Document for any implementation, that are not
defined in the PowerPC User Instruction Set Architecture, PowerPC
Virtual Environment Architecture, or PowerPC Operating Environment
Architecture.



PowerPC 
Instruction Set 
Sorted by Opcode 




This appendix lists all the instructions in the PowerPC Architecture, in
order by opcode. The Book number is shown in the "Page" column for
instructions that are not defined in Book I.



Form  Opcode             Mode   Page       Mnemonic      Instruction
      Primary  Extended  Dep.¹
D     2                  ()     102        tdi           Trap Doubleword Immediate
D     3                         103        twi           Trap Word Immediate
D     7                         90         mulli         Multiply Low Immediate
D     8                  SR     85         subfic        Subtract From Immediate Carrying
D     10                        100        cmpli         Compare Logical Immediate
D     11                        99         cmpi          Compare Immediate
D     12                 SR     84         addic         Add Immediate Carrying
D     13                 SR     84         addic.        Add Immediate Carrying and Record
D     14                        82         addi          Add Immediate
D     15                        82         addis         Add Immediate Shifted
B     16                 CT     38         bc[l][a]      Branch Conditional
SC    17                        41         sc            System Call (see also Book III, page 378)
I     18                        38         b[l][a]       Branch
XL    19       0                46         mcrf          Move Condition Register Field
XL    19       16        CT     39         bclr[l]       Branch Conditional to Link Register
XL    19       33               44         crnor         Condition Register NOR
XL    19       50               379 (III)  rfi           Return From Interrupt
XL    19       129              45         crandc        Condition Register AND with Complement
XL    19       150              346 (II)   isync         Instruction Synchronize
XL    19       193              43         crxor         Condition Register XOR
XL    19       225              43         crnand        Condition Register NAND
XL    19       257              42         crand         Condition Register AND
XL    19       289              44         creqv         Condition Register Equivalent
XL    19       417              45         crorc         Condition Register OR with Complement
XL    19       449              42         cror          Condition Register OR
XL    19       528       CT     40         bcctr[l]      Branch Conditional to Count Register
M     20                 SR     122        rlwimi[.]     Rotate Left Word Immediate then Mask Insert
M     21                 SR     119        rlwinm[.]     Rotate Left Word Immediate then AND with Mask
M     23                 SR     121        rlwnm[.]      Rotate Left Word then AND with Mask
D     24                        107        ori           OR Immediate
D     25                        107        oris          OR Immediate Shifted
D     26                        108        xori          XOR Immediate
D     27                        108        xoris         XOR Immediate Shifted
D     28                 SR     106        andi.         AND Immediate
D     29                 SR     106        andis.        AND Immediate Shifted
MD    30       0         (SR)   116        rldicl[.]     Rotate Left Doubleword Immediate then Clear Left
MD    30       1         (SR)   117        rldicr[.]     Rotate Left Doubleword Immediate then Clear Right
MD    30       2         (SR)   118        rldic[.]      Rotate Left Doubleword Immediate then Clear
MD    30       3         (SR)   121        rldimi[.]     Rotate Left Doubleword Immediate then Mask Insert
MDS   30       8         (SR)   119        rldcl[.]      Rotate Left Doubleword then Clear Left
MDS   30       9         (SR)   120        rldcr[.]      Rotate Left Doubleword then Clear Right
X     31       0                99         cmp           Compare
X     31       4                105        tw            Trap Word
XO    31       8         SR     86         subfc[o][.]   Subtract From Carrying
XO    31       9         (SR)   92         mulhdu[.]     Multiply High Doubleword Unsigned
XO    31       10        SR     85         addc[o][.]    Add Carrying
XO    31       11        SR     93         mulhwu[.]     Multiply High Word Unsigned
X     31       19               132        mfcr          Move From Condition Register
X     31       20               77         lwarx         Load Word And Reserve Indexed
X     31       21        ()     59         ldx           Load Doubleword Indexed
X     31       23               56         lwzx          Load Word and Zero Indexed
X     31       24        SR     121        slw[.]        Shift Left Word
X     31       26        SR     114        cntlzw[.]     Count Leading Zeros Word
X     31       27        (SR)   123        sld[.]        Shift Left Doubleword
X     31       28        SR     109        and[.]        AND
X     31       32               101        cmpl          Compare Logical
XO    31       40        SR     83         subf[o][.]    Subtract From
X     31       53        ()     60         ldux          Load Doubleword with Update Indexed
X     31       54               348 (II)   dcbst         Data Cache Block Store
X     31       55               57         lwzux         Load Word and Zero with Update Indexed
X     31       58        (SR)   114        cntlzd[.]     Count Leading Zeros Doubleword
X     31       60        SR     111        andc[.]       AND with Complement
X     31       68        ()     104        td            Trap Doubleword
XO    31       73        (SR)   91         mulhd[.]      Multiply High Doubleword
XO    31       75        SR     92         mulhw[.]      Multiply High Word
X     31       83               389 (III)  mfmsr         Move From Machine State Register
X     31       84        ()     77         ldarx         Load Doubleword And Reserve Indexed
X     31       86               349 (II)   dcbf          Data Cache Block Flush
X     31       87               50         lbzx          Load Byte and Zero Indexed
XO    31       104       SR     89         neg[o][.]     Negate
X     31       119              51         lbzux         Load Byte and Zero with Update Indexed
X     31       124       SR     110        nor[.]        NOR
XO    31       136       SR     87         subfe[o][.]   Subtract From Extended
XO    31       138       SR     86         adde[o][.]    Add Extended
XFX   31       144              131        mtcrf         Move To Condition Register Fields
X     31       146              389 (III)  mtmsr         Move To Machine State Register
X     31       149       ()     67         stdx          Store Doubleword Indexed
X     31       150              78         stwcx.        Store Word Conditional Indexed
X     31       151              65         stwx          Store Word Indexed
X     31       181       ()     68         stdux         Store Doubleword with Update Indexed
X     31       183              66         stwux         Store Word with Update Indexed
XO    31       200       SR     89         subfze[o][.]  Subtract From Zero Extended
XO    31       202       SR     88         addze[o][.]   Add to Zero Extended
X     31       210       []     440 (III)  mtsr          Move To Segment Register
X     31       214       ()     79         stdcx.        Store Doubleword Conditional Indexed
X     31       215              61         stbx          Store Byte Indexed
XO    31       232       SR     88         subfme[o][.]  Subtract From Minus One Extended
XO    31       233       (SR)   90         mulld[o][.]   Multiply Low Doubleword
XO    31       234       SR     87         addme[o][.]   Add to Minus One Extended
XO    31       235       SR     91         mullw[o][.]   Multiply Low Word
X     31       242       []     441 (III)  mtsrin        Move To Segment Register Indirect
X     31       246              347 (II)   dcbtst        Data Cache Block Touch for Store
X     31       247              62         stbux         Store Byte with Update Indexed
XO    31       266       SR     83         add[o][.]     Add
X     31       278              346 (II)   dcbt          Data Cache Block Touch
X     31       279              52         lhzx          Load Halfword and Zero Indexed
X     31       284       SR     111        eqv[.]        Equivalent
X     31       306              444 (III)  tlbie         TLB Invalidate Entry
X     31       310              491 (III)  eciwx         External Control In Word Indexed
X     31       311              53         lhzux         Load Halfword and Zero with Update Indexed
X     31       316       SR     110        xor[.]        XOR
XFX   31       339              130        mfspr         Move From Special Purpose Register (see also Book III, page 387)
X     31       341       ()     58         lwax          Load Word Algebraic Indexed
X     31       343              54         lhax          Load Halfword Algebraic Indexed
X     31       370              445 (III)  tlbia         TLB Invalidate All
XFX   31       371              352 (II)   mftb          Move From Time Base
X     31       373       ()     58         lwaux         Load Word Algebraic with Update Indexed
X     31       375              55         lhaux         Load Halfword Algebraic with Update Indexed
X     31       407              63         sthx          Store Halfword Indexed
X     31       412       SR     112        orc[.]        OR with Complement
XS    31       413       (SR)   126        sradi[.]      Shift Right Algebraic Doubleword Immediate
X     31       434       ()     443 (III)  slbie         SLB Invalidate Entry
X     31       438              492 (III)  ecowx         External Control Out Word Indexed
X     31       439              64         sthux         Store Halfword with Update Indexed
X     31       444       SR     109        or[.]         OR
XO    31       457       (SR)   96         divdu[o][.]   Divide Doubleword Unsigned
XO    31       459       SR     97         divwu[o][.]   Divide Word Unsigned
XFX   31       467              129        mtspr         Move To Special Purpose Register (see also Book III, page 384)
X     31       470              439 (III)  dcbi          Data Cache Block Invalidate
X     31       476       SR     110        nand[.]       NAND
XO    31       489       (SR)   94         divd[o][.]    Divide Doubleword
XO    31       491       SR     95         divw[o][.]    Divide Word
X     31       498       ()     444 (III)  slbia         SLB Invalidate All
X     31       512              132        mcrxr         Move to Condition Register from XER
X     31       533              74         lswx          Load String Word Indexed
X     31       534              69         lwbrx         Load Word Byte-Reverse Indexed
X     31       535              170        lfsx          Load Floating-Point Single Indexed
X     31       536       SR     125        srw[.]        Shift Right Word
X     31       539       (SR)   124        srd[.]        Shift Right Doubleword
X     31       566              445 (III)  tlbsync       TLB Synchronize
X     31       567              171        lfsux         Load Floating-Point Single with Update Indexed
X     31       595       []     441 (III)  mfsr          Move From Segment Register
X     31       597              73         lswi          Load String Word Immediate
X     31       598              80         sync          Synchronize
X     31       599              172        lfdx          Load Floating-Point Double Indexed
X     31       631              172        lfdux         Load Floating-Point Double with Update Indexed
X     31       659       []     442 (III)  mfsrin        Move From Segment Register Indirect
X     31       661              76         stswx         Store String Word Indexed
X     31       662              70         stwbrx        Store Word Byte-Reverse Indexed
X     31       663              175        stfsx         Store Floating-Point Single Indexed
X     31       695              175        stfsux        Store Floating-Point Single with Update Indexed
X     31       725              75         stswi         Store String Word Immediate
X     31       727              176        stfdx         Store Floating-Point Double Indexed
X     31       759              177        stfdux        Store Floating-Point Double with Update Indexed
X     31       790              68         lhbrx         Load Halfword Byte-Reverse Indexed
X     31       792       SR     128        sraw[.]       Shift Right Algebraic Word
X     31       794       (SR)   127        srad[.]       Shift Right Algebraic Doubleword
X     31       824       SR     126        srawi[.]      Shift Right Algebraic Word Immediate
X     31       854              350 (II)   eieio         Enforce In-order Execution of I/O
X     31       918              69         sthbrx        Store Halfword Byte-Reverse Indexed
X     31       922       SR     113        extsh[.]      Extend Sign Halfword
X     31       954       SR     112        extsb[.]      Extend Sign Byte
X     31       982              345 (II)   icbi          Instruction Cache Block Invalidate
X     31       983              198        stfiwx        Store Floating-Point as Integer Word Indexed
X     31       986       (SR)   113        extsw[.]      Extend Sign Word
X     31       1014             347 (II)   dcbz          Data Cache Block set to Zero
D     32                        55         lwz           Load Word and Zero
D     33                        56         lwzu          Load Word and Zero with Update
D     34                        50         lbz           Load Byte and Zero
D     35                        51         lbzu          Load Byte and Zero with Update
D     36                        64         stw           Store Word
D     37                        65         stwu          Store Word with Update
D     38                        61         stb           Store Byte
D     39                        62         stbu          Store Byte with Update
D     40                        52         lhz           Load Halfword and Zero
D     41                        53         lhzu          Load Halfword and Zero with Update
D     42                        54         lha           Load Halfword Algebraic
D     43                        54         lhau          Load Halfword Algebraic with Update
D     44                        63         sth           Store Halfword
D     45                        63         sthu          Store Halfword with Update
D     46                        71         lmw           Load Multiple Word
D     47                        72         stmw          Store Multiple Word
D     48                        169        lfs           Load Floating-Point Single
D     49                        170        lfsu          Load Floating-Point Single with Update
D     50                        171        lfd           Load Floating-Point Double
D     51                        172        lfdu          Load Floating-Point Double with Update
D     52                        174        stfs          Store Floating-Point Single
D     53                        175        stfsu         Store Floating-Point Single with Update
D     54                        176        stfd          Store Floating-Point Double
D     55                        177        stfdu         Store Floating-Point Double with Update
DS    58       0         ()     59         ld            Load Doubleword
DS    58       1         ()     60         ldu           Load Doubleword with Update
DS    58       2         ()     57         lwa           Load Word Algebraic
A     59       18               182        fdivs[.]      Floating Divide Single
A     59       20               180        fsubs[.]      Floating Subtract Single
A     59       21               179        fadds[.]      Floating Add Single
A     59       22               198        fsqrts[.]     Floating Square Root Single
A     59       24               200        fres[.]       Floating Reciprocal Estimate Single
A     59       25               181        fmuls[.]      Floating Multiply Single
A     59       28               184        fmsubs[.]     Floating Multiply-Subtract Single
A     59       29               183        fmadds[.]     Floating Multiply-Add Single
A     59       30               186        fnmsubs[.]    Floating Negative Multiply-Subtract Single
A     59       31               185        fnmadds[.]    Floating Negative Multiply-Add Single
DS    62       0         ()     66         std           Store Doubleword
DS    62       1         ()     67         stdu          Store Doubleword with Update
X     63       0                191        fcmpu         Floating Compare Unordered
X     63       12               187        frsp[.]       Floating Round to Single-Precision
X     63       14               189        fctiw[.]      Floating Convert To Integer Word
X     63       15               190        fctiwz[.]     Floating Convert To Integer Word with round toward Zero
A     63       18               182        fdiv[.]       Floating Divide
A     63       20               180        fsub[.]       Floating Subtract
A     63       21               179        fadd[.]       Floating Add
A     63       22               198        fsqrt[.]      Floating Square Root
A     63       23               202        fsel[.]       Floating Select
A     63       25               181        fmul[.]       Floating Multiply
A     63       26               201        frsqrte[.]    Floating Reciprocal Square Root Estimate
A     63       28               184        fmsub[.]      Floating Multiply-Subtract
A     63       29               183        fmadd[.]      Floating Multiply-Add
A     63       30               186        fnmsub[.]     Floating Negative Multiply-Subtract
A     63       31               185        fnmadd[.]     Floating Negative Multiply-Add
X     63       32               192        fcmpo         Floating Compare Ordered
X     63       38               196        mtfsb1[.]     Move To FPSCR Bit 1
X     63       40               178        fneg[.]       Floating Negate
X     63       64               194        mcrfs         Move to Condition Register from FPSCR
X     63       70               195        mtfsb0[.]     Move To FPSCR Bit 0
X     63       72               178        fmr[.]        Floating Move Register
X     63       134              194        mtfsfi[.]     Move To FPSCR Field Immediate
X     63       136              179        fnabs[.]      Floating Negative Absolute Value
X     63       264              178        fabs[.]       Floating Absolute Value
X     63       583              194        mffs[.]       Move From FPSCR
XFL   63       711              195        mtfsf[.]      Move To FPSCR Fields
X     63       814       ()     187        fctid[.]      Floating Convert To Integer Doubleword
X     63       815       ()     188        fctidz[.]     Floating Convert To Integer Doubleword with round toward Zero
X     63       846       ()     190        fcfid[.]      Floating Convert From Integer Doubleword

¹ See key to mode dependency column, on page 315.
Form  Opcode             Mode   Page       Mnemonic      Instruction
      Primary  Extended  Dep.¹
XO    31       266       SR     83         add[o][.]     Add
XO    31       10        SR     85         addc[o][.]    Add Carrying
XO    31       138       SR     86         adde[o][.]    Add Extended
D     14                        82         addi          Add Immediate
D     12                 SR     84         addic         Add Immediate Carrying
D     13                 SR     84         addic.        Add Immediate Carrying and Record
D     15                        82         addis         Add Immediate Shifted
XO    31       234       SR     87         addme[o][.]   Add to Minus One Extended
XO    31       202       SR     88         addze[o][.]   Add to Zero Extended
X     31       28        SR     109        and[.]        AND
X     31       60        SR     111        andc[.]       AND with Complement
D     28                 SR     106        andi.         AND Immediate
D     29                 SR     106        andis.        AND Immediate Shifted
I     18                        38         b[l][a]       Branch
B     16                 CT     38         bc[l][a]      Branch Conditional
XL    19       528       CT     40         bcctr[l]      Branch Conditional to Count Register
XL    19       16        CT     39         bclr[l]       Branch Conditional to Link Register
X     31       0                99         cmp           Compare
D     11                        99         cmpi          Compare Immediate
X     31       32               101        cmpl          Compare Logical
D     10                        100        cmpli         Compare Logical Immediate
X     31       58        (SR)   114        cntlzd[.]     Count Leading Zeros Doubleword
X     31       26        SR     114        cntlzw[.]     Count Leading Zeros Word
XL    19       257              42         crand         Condition Register AND
XL    19       129              45         crandc        Condition Register AND with Complement
XL    19       289              44         creqv         Condition Register Equivalent
XL    19       225              43         crnand        Condition Register NAND
XL    19       33               44         crnor         Condition Register NOR
XL    19       449              42         cror          Condition Register OR
XL    19       417              45         crorc         Condition Register OR with Complement
XL    19       193              43         crxor         Condition Register XOR
X     31       86               349 (II)   dcbf          Data Cache Block Flush
X     31       470              439 (III)  dcbi          Data Cache Block Invalidate
X     31       54               348 (II)   dcbst         Data Cache Block Store
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Form 


opcode 


Mode 


Page 


Mnemonic 


Instruction 


Primary 


Extended 


Dep.l 


X 


31 


278 




346 (II) 


debt 


Data Cache Block Touch 


X 


31 


246 




347 (II) 


dcbtst 


Data Cache Block Touch for Store 


X 


31 


1014 




347 (II) 


dcbz 


Data Cache Block set to Zero 


xo 


31 


489 


(SR) 


94 


divd[o][.] 


Divide Doubleword 


xo 


31 


457 


(SR) 


96 


divdu[o][.] 


Divide Doubleword Unsigned 


xo 


31 


491 


SR 


95 


divw[o][.] 


Divide Word 


xo 


31 


459 


SR 


97 


divwu[o][.] 


Divide Word Unsigned 


X 


31 


310 




491 (III) 


eciwx 


External Control In Word Indexed 


X 


31 


438 




492 (III) 


ecowx 


External Control Out Word Indexed 


X 


31 


854 




350 (II) 


eieio 


Enforce In-order Execution of I/O 


X 


31 


284 


SR 


111 


eqv[.] 


Equivalent 


X 


31 


954 


SR 


112 


extsb[.] 


Extend Sign Byte 


X 


31 


922 


SR 


113 


extsh[.] 


Extend Sign Halfword 


X 


31 


986 


(SR) 


113 


extsw[.] 


Extend Sign Word 


X 


63 


264 




178 


fabs[.] 


Floating Absolute Value 


A 


63 


21 




179 


fadd[.] 


Floating Add 


A 


59 


21 




179 


fadds[.] 


Floating Add Single 


X 


63 


846 


0 


190 


fcfid[.] 


Floating Convert From Integer Doubleword 


X 


63 


32 




192 


fcmpo 


Floating Compare Ordered 


X 


63 


0 




191 


fcmpu 


Floating Compare Unordered 


X 


63 


814 


() 


187 


fctid[.] 


Floating Convert To Integer Doubleword 


X 


63 


815 


() 


188 


fctidz[.] 


Floating Convert To Integer Doubleword with round toward Zero 


X 


63 


14 




189 


fctiw[.] 


Floating Convert To Integer Word 


X 


63 


15 




190 


fctiwz[.] 


Floating Convert To Integer Word with round toward 
Zero 


A 


63 


18 




182 


fdiv[.] 


Floating Divide 


A 


59 


18 




182 


fdivs[.] 


Floating Divide Single 


A 


63 


29 




183 


fmadd[.] 


Floating Multiply-Add 


A 


59 


29 




183 


fmadds[.] 


Floating Multiply-Add Single 


X 


63 


72 




178 


fmr[.] 


Floating Move Register 


A 


63 


28 




184 


fmsub[.] 


Floating Multiply-Subtract 


A 


59 


28 




184 


fmsubs[.] 


Floating Multiply-Subtract Single 


A 


63 


25 




181 


fmul[.] 


Floating Multiply 


A 


59 


25 




181 


fmuls[.] 


Floating Multiply Single 


X 


63 


136 




179 


fnabs[.] 


Floating Negative Absolute Value 


X 


63 


40 




178 


fneg[.] 


Floating Negate 


A 


63 


31 




185 


fnmadd[.] 


Floating Negative Multiply-Add 


A 


59 


31 




185 


fnmadds[.] 


Floating Negative Multiply-Add Single 


A 


63 


30 




186 


fnmsub[.] 


Floating Negative Multiply-Subtract 


A 


59 


30 




186 


fnmsubs[.] 


Floating Negative Multiply-Subtract Single 


A 


59 


24 




200 


fres[.] 


Floating Reciprocal Estimate Single 


X 


63 


12 




187 


frsp[.] 


Floating Round to Single-Precision 


A 


63 


26 




201 


frsqrte[.] 


Floating Reciprocal Square Root Estimate 


A 


63 


23 




202 


fsel[.] 


Floating Select 


A 


63 


22 




198 


fsqrt[.] 


Floating Square Root 


A 


59 


22 




198 


fsqrts[.] 


Floating Square Root Single 


A 


63 


20 




180 


fsub[.] 


Floating Subtract 


A 


59 


20 




180 


fsubs[.] 


Floating Subtract Single 


X 


31 


982 




345 (II) 


icbi 


Instruction Cache Block Invalidate 


XL 


19 


150 




346 (II) 


isync 


Instruction Synchronize 


D 


34 






50 


lbz 


Load Byte and Zero 


D 


35 






51 


lbzu 


Load Byte and Zero with Update 


X 


31 


119 




51 


lbzux 


Load Byte and Zero with Update Indexed 


X 


31 


87 




50 


lbzx 


Load Byte and Zero Indexed 


DS 


58 


0 




59 


ld 


Load Doubleword 


X 


31 


84 




77 


ldarx 


Load Doubleword And Reserve Indexed 


DS 


58 


1 




60 


ldu 


Load Doubleword with Update 


X 


31 


53 




60 


ldux 


Load Doubleword with Update Indexed 


X 


31 


21 




59 


ldx 


Load Doubleword Indexed 


D 


50 






171 


lfd 


Load Floating-Point Double 


D 


51 






172 


lfdu 


Load Floating-Point Double with Update 


X 


31 


631 




172 


lfdux 


Load Floating-Point Double with Update Indexed 


X 


31 


599 




172 


lfdx 


Load Floating-Point Double Indexed 


D 


48 






169 


lfs 


Load Floating-Point Single 


D 


49 






170 


lfsu 


Load Floating-Point Single with Update 


X 


31 


567 




171 


lfsux 


Load Floating-Point Single with Update Indexed 


X 


31 


535 




170 


lfsx 


Load Floating-Point Single Indexed 


D 


42 






54 


lha 


Load Halfword Algebraic 


D 


43 






54 


lhau 


Load Halfword Algebraic with Update 


X 


31 


375 




55 


lhaux 


Load Halfword Algebraic with Update Indexed 


X 


31 


343 




54 


lhax 


Load Halfword Algebraic Indexed 


X 


31 


790 




68 


lhbrx 


Load Halfword Byte-Reverse Indexed 


D 


40 






52 


lhz 


Load Halfword and Zero 


D 


41 






53 


lhzu 


Load Halfword and Zero with Update 


X 


31 


311 




53 


lhzux 


Load Halfword and Zero with Update Indexed 


X 


31 


279 




52 


lhzx 


Load Halfword and Zero Indexed 


D 


46 






71 


lmw 


Load Multiple Word 


X 


31 


597 




73 


lswi 


Load String Word Immediate 


X 


31 


533 




74 


lswx 


Load String Word Indexed 


DS 


58 


2 


() 


57 


lwa 


Load Word Algebraic 


X 


31 


20 




77 


lwarx 


Load Word And Reserve Indexed 


X 


31 


373 


() 


58 


lwaux 


Load Word Algebraic with Update Indexed 


X 


31 


341 


() 


58 


lwax 


Load Word Algebraic Indexed 


X 


31 


534 




69 


lwbrx 


Load Word Byte-Reverse Indexed 


D 


32 






55 


lwz 


Load Word and Zero 


D 


33 






56 


lwzu 


Load Word and Zero with Update 


X 


31 


55 




57 


lwzux 


Load Word and Zero with Update Indexed 


X 


31 


23 




56 


lwzx 


Load Word and Zero Indexed 


XL 


19 


0 




46 


mcrf 


Move Condition Register Field 


X 


63 


64 




194 


mcrfs 


Move to Condition Register from FPSCR 


X 


31 


512 




132 


mcrxr 


Move to Condition Register from XER 


X 


31 


19 




132 


mfcr 


Move From Condition Register 


X 


63 


583 




194 


mffs[.] 


Move From FPSCR 


X 


31 


83 




OQQ /TTT\ 


mfmsr 


Move From Machine State Register 


XFX 


31 


339 




130 


mfspr 


Move From Special Purpose Register (see also Book III, 
page 387) 


X 


31 


595 


{} 


441 (III) 


mfsr 


Move From Segment Register 


X 


31 


659 


{} 


442 (III) 


mfsrin 


Move From Segment Register Indirect 


XFX 


31 


371 




352 (II) 


mftb 


Move From Time Base 


XFX 


31 


144 




131 


mtcrf 


Move To Condition Register Fields 


X 


63 


70 




195 


mtfsb0[.] 


Move To FPSCR Bit 0 


X 


63 


38 




196 


mtfsb1[.] 


Move To FPSCR Bit 1 


XFL 


63 


711 




195 


mtfsf[.] 


Move To FPSCR Fields 


X 


63 


134 




194 


mtfsfi[.] 


Move To FPSCR Field Immediate 


X 


31 


146 




^QQ /TTT\ 


mtmsr 


Move To Machine State Register 


XFX 


31 


467 




129 


mtspr 


Move To Special Purpose Register (see also Book III, 
page 384) 


X 


31 


210 


{} 


440 (III) 


mtsr 


Move To Segment Register 


X 


31 


242 


{} 


441 (III) 


mtsrin 


Move To Segment Register Indirect 


XO 


31 


73 


(SR) 


91 


mulhd[.] 


Multiply High Doubleword 


XO 


31 


9 


(SR) 


92 


mulhdu[.] 


Multiply High Doubleword Unsigned 


XO 


31 


75 


SR 


92 


mulhw[.] 


Multiply High Word 


XO 


31 


11 


SR 


93 


mulhwu[.] 


Multiply High Word Unsigned 


XO 


31 


233 


(SR) 


90 


mulld[o][.] 


Multiply Low Doubleword 


D 


7 






90 


mulli 


Multiply Low Immediate 


XO 


31 


235 


SR 


91 


mullw[o][.] 


Multiply Low Word 


X 


31 


476 


SR 


110 


nand[.] 


NAND 


XO 


31 


104 


SR 


89 


neg[o][.] 


Negate 


X 


31 


124 


SR 


110 


nor[.] 


NOR 


X 


31 


444 


SR 


109 


or[.] 


OR 


X 


31 


412 


SR 


112 


orc[.] 


OR with Complement 


D 


24 






107 


ori 


OR Immediate 


D 


25 






107 


oris 


OR Immediate Shifted 


XL 


19 


50 




379 (III) 


rfi 


Return From Interrupt 


MDS 


30 


8 


(SR) 


119 


rldcl[.] 


Rotate Left Doubleword then Clear Left 


MDS 


30 


9 


(SR) 


120 


rldcr[.] 


Rotate Left Doubleword then Clear Right 


MD 


30 


2 


(SR) 


118 


rldic[.] 


Rotate Left Doubleword Immediate then Clear 


MD 


30 


0 


(SR) 


116 


rldicl[.] 


Rotate Left Doubleword Immediate then Clear Left 


MD 


30 


1 


(SR) 


117 


rldicr[.] 


Rotate Left Doubleword Immediate then Clear Right 


MD 


30 


3 


(SR) 


121 


rldimi[.] 


Rotate Left Doubleword Immediate then Mask Insert 


M 


20 




SR 


122 


rlwimi[.] 


Rotate Left Word Immediate then Mask Insert 


M 


21 




SR 


119 


rlwinm[.] 


Rotate Left Word Immediate then AND with Mask 


M 


23 




SR 


121 


rlwnm[.] 


Rotate Left Word then AND with Mask 


SC 


17 






41 


sc 


System Call (see also Book III, page 378) 


X 


31 


498 


() 


444 (III) 


slbia 


SLB Invalidate All 


X 


31 


434 


() 


443 (III) 


slbie 


SLB Invalidate Entry 


X 


31 


27 


(SR) 


123 


sld[.] 


Shift Left Doubleword 


X 


31 


24 


SR 


124 


slw[.] 


Shift Left Word 


X 


31 


794 


(SR) 


127 


srad[.] 


Shift Right Algebraic Doubleword 


XS 


31 


413 


(SR) 


126 


sradi[.] 


Shift Right Algebraic Doubleword Immediate 


X 


31 


792 


SR 


128 


sraw[.] 


Shift Right Algebraic Word 


X 


31 


824 


SR 


126 


srawi[.] 


Shift Right Algebraic Word Immediate 


X 


31 


539 


(SR) 


124 


srd[.] 


Shift Right Doubleword 


X 


31 


536 


SR 


125 


srw[.] 


Shift Right Word 


D 


38 






61 


stb 


Store Byte 


D 


39 






62 


stbu 


Store Byte with Update 


X 


31 


247 




62 


stbux 


Store Byte with Update Indexed 


X 


31 


215 




61 


stbx 


Store Byte Indexed 


DS 


62 


0 




66 


std 


Store Doubleword 


X 


31 


214 




79 


stdcx. 


Store Doubleword Conditional Indexed 


DS 


62 


1 




67 


stdu 


Store Doubleword with Update 


X 


31 


181 




68 


stdux 


Store Doubleword with Update Indexed 


X 


31 


149 




67 


stdx 


Store Doubleword Indexed 


D 


54 






176 


stfd 


Store Floating-Point Double 


D 


55 






177 


stfdu 


Store Floating-Point Double with Update 


X 


31 


759 




177 


stfdux 


Store Floating-Point Double with Update Indexed 


X 


31 


727 




176 


stfdx 


Store Floating-Point Double Indexed 


X 


31 


983 




198 


stfiwx 


Store Floating-Point as Integer Word Indexed 


D 


52 






174 


stfs 


Store Floating-Point Single 


D 


53 






175 


stfsu 


Store Floating-Point Single with Update 


X 


31 


695 




175 


stfsux 


Store Floating-Point Single with Update Indexed 


X 


31 


663 




175 


stfsx 


Store Floating-Point Single Indexed 


D 


44 






63 


sth 


Store Halfword 


X 


31 


918 




69 


sthbrx 


Store Halfword Byte-Reverse Indexed 


D 


45 






63 


sthu 


Store Halfword with Update 


X 


31 


439 




64 


sthux 


Store Halfword with Update Indexed 


X 


31 


407 




63 


sthx 


Store Halfword Indexed 


D 


47 






72 


stmw 


Store Multiple Word 


X 


31 


725 




75 


stswi 


Store String Word Immediate 


X 


31 


661 




76 


stswx 


Store String Word Indexed 


D 


36 






64 


stw 


Store Word 


X 


31 


662 




70 


stwbrx 


Store Word Byte-Reverse Indexed 


X 


31 


150 




78 


stwcx. 


Store Word Conditional Indexed 


D 


37 






65 


stwu 


Store Word with Update 


X 


31 


183 




66 


stwux 


Store Word with Update Indexed 


X 


31 


151 




65 


stwx 


Store Word Indexed 


XO 


31 


40 


SR 


83 


subf[o][.] 


Subtract From 


XO 


31 


8 


SR 


86 


subfc[o][.] 


Subtract From Carrying 


XO 


31 


136 


SR 


87 


subfe[o][.] 


Subtract From Extended 


D 


8 




SR 


85 


subfic 


Subtract From Immediate Carrying 


XO 


31 


232 


SR 


88 


subfme[o][.] 


Subtract From Minus One Extended 


XO 


31 


200 


SR 


89 


subfze[o][.] 


Subtract From Zero Extended 


X 


31 


598 




80 


sync 


Synchronize 


X 


31 


68 


() 


104 


td 


Trap Doubleword 


D 


2 




() 


102 


tdi 


Trap Doubleword Immediate 


X 


31 


370 




445 (III) 


tlbia 


TLB Invalidate All 


X 


31 


306 




444 (III) 


tlbie 


TLB Invalidate Entry 


X 


31 


566 




445 (III) 


tlbsync 


TLB Synchronize 


X 


31 


4 




105 


tw 


Trap Word 


D 


3 






103 


twi 


Trap Word Immediate 


X 


31 


316 


SR 


110 


xor[.] 


XOR 


D 


26 






108 


xori 


XOR Immediate 


D 


27 






108 


xoris 


XOR Immediate Shifted 



¹ Key to Mode Dependency Column 

The entry is shown in parentheses () if the instruction is defined only for 64-bit implementations. 
The entry is shown in braces {} if the instruction is defined only for 32-bit implementations. 



blank The instruction has no mode dependence, except that if the instruction refers to storage when in 32-bit mode, only 
the low-order 32 bits of the 64-bit effective address are used to address storage. Storage reference instructions in- 
clude loads, stores, branch instructions, etc. 

CT If the instruction tests the Count Register, it tests the low-order 32 bits when in 32-bit mode and all 64 bits when 
in 64-bit mode. 

SR The instruction's primary function is mode-independent, but the setting of status registers (such as XER and CR0) 
is mode-dependent. 
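
The Primary and Extended opcode columns of the table correspond to fields of the 32-bit instruction word. The following is a minimal sketch in C, not a complete decoder. It assumes the standard PowerPC encoding, in which the primary opcode occupies instruction bits 0-5 (the most significant bits) and the extended opcode of X-, XL-, and XFX-form instructions occupies bits 21-30. Other forms, such as XO and A, use narrower extended-opcode fields and are not handled. The function names are illustrative only.

```c
#include <stdint.h>

/* Primary opcode: instruction bits 0-5, the six most significant bits. */
unsigned primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Extended opcode for X-, XL-, and XFX-form instructions: bits 21-30,
 * a 10-bit field that sits just above the least significant bit. */
unsigned extended_opcode_x(uint32_t insn)
{
    return (insn >> 1) & 0x3FFu;
}
```

For example, the instruction word 0x7C0004AC decodes to primary opcode 31 and extended opcode 598, which is the table entry for sync.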







tecture and PowerPC Virtual Environment Architecture. It covers the storage 
model and related instructions and facilities available to the application 
programmer, and the Time Base as seen by the application programmer. 



PowerPC 
Virtual 
Environment 
Architecture 



Chapter 1 Storage Model 319 

1.1 Definitions and Notation 319 

1.2 Introduction 321 

1.3 Virtual Storage 321 

1.4 Single-Copy Atomicity 322 

1.5 Memory Coherence 323 

1.6 Storage Control Attributes 325 

1.7 Cache Models 327 

1.8 Shared Storage 333 

Chapter 2 Effect of Operand Placement on Performance 339 

2.1 Instruction Restart 341 

2.2 Atomicity and Order 342 

Chapter 3 Storage Control Instructions 343 

3.1 Parameters Useful to Application Programs 343 

3.2 Cache Management Instructions 344 

3.3 Enforce In-order Execution of I/O Instruction 350 

Chapter 4 Time Base 351 

4.1 Time Base Instructions 352 

4.2 Reading the Time Base on 64-bit Implementations 353 

4.3 Reading the Time Base on 32-bit Implementations 354 

4.4 Computing Time of Day from the Time Base 354 

Appendix A Cross-Reference for Changed POWER Mnemonics 359 

Appendix B New Instructions 361 

Appendix C PowerPC Virtual Environment Instruction Set 363 



Storage Model 




1.1 Definitions and Notation 

The following definitions, in addition to those specified in Book I, are 
used in this Book. 

■ processor 

A hardware component that executes the PowerPC instructions speci- 
fied in a program. 

■ system 

A combination of processors, storage, and associated mechanisms that 
is capable of executing programs. Sometimes the reference to system 
includes services provided by the operating system. 

■ main storage 

The level of the storage hierarchy in which all storage state is visible to 
all processors and mechanisms in the system. 

■ sequential execution 

A model for the execution of a sequence of instructions (program) in 
which one instruction is executed and completed before the next 
instruction is begun. Instructions are executed in the order in which 
they appear in the program, except following the execution of a 
branch instruction, which causes sequential execution to continue at 
the location specified by the branch instruction. 

■ program order 

The execution of instructions in the strict order in which they occur in 
the program. See sequential execution above. 
■ storage location 

One or more sequential bytes of storage beginning at the address com- 
puted by a Storage Access instruction or by the instruction fetching 
mechanism. The number of bytes comprising the location depends on 
the type of Storage Access instruction being executed, or is four for 
instruction fetching. 

■ load 

An instruction that copies one or more bytes from a storage location 
to one or more registers (GPRs or FPRs). 

■ store 

An instruction that copies one or more bytes from one or more regis- 
ters (GPRs or FPRs) to a storage location. 

■ uniprocessor 

A system that contains one PowerPC processor. 

■ multiprocessor 

A system that contains two or more PowerPC processors. 

■ shared storage multiprocessor 

A multiprocessor that contains some common storage, which all the 
PowerPC processors in the system can access. 

■ performed 

A load is performed with respect to all other processors (and mecha- 
nisms) when the value to be returned by the load can no longer be 
changed by a subsequent store by any processor (or other mechanism). 
A store is performed with respect to all other processors (and mecha- 
nisms) when any load from the same location used by the store returns 
the value stored (or a value stored subsequently). 

■ page 

A unit of storage for which protection and control attributes are inde- 
pendently specifiable and for which reference and change status are 
independently recorded. 

■ block 

The aligned unit of storage operated on by each Cache Management 
instruction. The size of a block can vary by instruction and by imple- 
mentation. The maximum block size is one page. 

■ aligned storage access 

A load or store is aligned if the address of the target storage location is 
a multiple of the size of the transfer effected by the instruction. 
■ atomic access 

A storage access executed by a processor during which no other pro- 
cessor or mechanism can access any byte of the target location 
between the time the processor performing the access accesses any 
byte of the location and the time that it completes the access to all 
bytes of that location. 

1.2 Introduction 

The PowerPC User Instruction Set Architecture, discussed in Book I, 
defines storage as a linear array of bytes indexed from 0 to a maximum of 
2^64 - 1 {2^32 - 1}. Each byte is identified by its index, called its address, 
and each byte contains a value. This information is sufficient to allow the 
programming of applications that require no special features of any par- 
ticular system environment. The PowerPC Virtual Environment Architec- 
ture, described herein, expands this simple storage model to include 
caches, virtual storage, and shared storage multiprocessors. The PowerPC 
Virtual Environment Architecture, in conjunction with services based on 
the PowerPC Operating Environment Architecture (see Book III) and 
provided by the operating system, permits explicit control of this 
expanded storage model. A simple model for sequential execution allows 
at most one storage access to be performed at a time and requires that all 
storage accesses appear to be performed in program order. In contrast to 
this simple model, the PowerPC architecture specifies a relaxed model of 
memory consistency. In a multiprocessor system that allows multiple cop- 
ies of a location, aggressive implementations of the architecture can per- 
mit intervals of time during which different copies of a location have 
different values. This chapter describes features of the PowerPC architec- 
ture that enable programmers to write correct programs for this memory 
model. 

1.3 Virtual Storage 

The PowerPC system implements a virtual storage model for applica- 
tions. This means that a combination of hardware and software can 
present a storage model that allows applications to exist within a "vir- 
tual" address space larger than either the effective address space or the 
real address space. 

Each program can access 2^64 {2^32} bytes of "effective address" (EA) 
space, subject to limitations imposed by the operating system. In a typical 
PowerPC system, each program's EA space is a subset of a larger "virtual 
address" (VA) space managed by the operating system. 

Each effective address is translated to a real address (i.e., to an address 
in real storage) before being used to access storage. The hardware accom- 
plishes this, using the address translation mechanism described in Book 
III, Chapter 4, "Storage Control," on page 391. The operating system 
manages the real (physical) storage resources of the system, by setting up 
the tables and other information used by the hardware address transla- 
tion mechanism. 

Book II deals primarily with effective addresses that are in "ordinary 
segments" translated by the "segmented address translation mechanism" 
(see Book III, Chapter 4). Each such effective address lies in a "virtual 
page," which is mapped to a "real page" before data in the virtual page 
are accessed. 
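
The mapping of a virtual page to a real page leaves the byte offset within the page unchanged. The following C sketch illustrates only that arithmetic. It assumes a 4 KB page size, which this chapter does not state, and it omits the segment and page-table lookups described in Book III. All names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                          /* assumed 4 KB pages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/* Page number containing an address. */
uint64_t page_number(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Byte offset of an address within its page. */
uint64_t page_offset(uint64_t addr)
{
    return addr & (PAGE_SIZE - 1);
}

/* Real address formed from a real page number (supplied here by the
 * caller; found by the translation mechanism in a real system) and the
 * page offset of the effective address. */
uint64_t real_address(uint64_t real_page, uint64_t ea)
{
    return (real_page << PAGE_SHIFT) | page_offset(ea);
}
```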

In general, main storage may not be large enough to map all the vir- 
tual pages used by the currently active applications. With support pro- 
vided by hardware, the operating system can attempt to use the available 
real pages to map a sufficient set of virtual pages of the applications. If a 
sufficient set is maintained, "paging" activity is minimized. If not, perfor- 
mance degradation is likely. 

The operating system can support restricted access to virtual pages 
(including read-write, read-only, and no access; see Book III, Section 
4.10, "Storage Protection," on page 436), based on system standards 
(e.g., program code might be read-only) and application requests. 

1.4 Single-Copy Atomicity 

An access is single-copy atomic, or simply atomic, if it is always per- 
formed in its entirety with no visible fragmentation. Atomic accesses are 
thus serialized: each happens in its entirety in some order, even when that 
order is not specified in the program or enforced between processors. 
In PowerPC the following single-register accesses are always atomic: 

■ byte accesses (all bytes are aligned on byte boundaries) 

■ halfword accesses aligned on halfword boundaries 

■ word accesses aligned on word boundaries 

■ doubleword accesses aligned on doubleword boundaries (64-bit imple- 
mentations only) 
No other accesses are guaranteed to be atomic. In particular, multiple- 
register loads and stores are not atomic, nor are floating-point double- 
word accesses on a 32-bit implementation. 
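
The rule for single-register accesses can be summarized as a predicate. The following C sketch is illustrative only; the function names and the impl_is_64bit parameter are assumptions introduced here, and the predicate does not cover multiple-register loads and stores, which are never guaranteed to be atomic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An access is aligned if its address is a multiple of its size. */
bool is_aligned(uint64_t addr, size_t size)
{
    return addr % size == 0;
}

/* True if a single-register access of `size` bytes at `addr` is
 * guaranteed atomic under the rules listed above. */
bool guaranteed_atomic(uint64_t addr, size_t size, bool impl_is_64bit)
{
    switch (size) {
    case 1:                     /* byte: always aligned */
    case 2:                     /* halfword */
    case 4:                     /* word */
        return is_aligned(addr, size);
    case 8:                     /* doubleword: 64-bit implementations only */
        return impl_is_64bit && is_aligned(addr, size);
    default:
        return false;
    }
}
```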

The results for several combinations of loads and stores to the same or 
overlapping locations are described below. 

1. When two processors execute atomic stores to locations that do not 
overlap, and no other stores are performed to those locations, the con- 
tent of those locations is the same as if the two stores were performed 
by a single processor. 

2. When two processors execute atomic stores to the same storage loca- 
tion, and no other store is performed to that location, the content of 
that location is the result stored by one of the processors. 

3. When two processors execute stores that have the same target location 
and are not guaranteed to be atomic, and no other store is performed 
to that location, the result is some combination of the bytes stored by 
both processors. 

4. When two processors execute stores to overlapping locations, and no 
other store is performed to those locations, the result is some combina- 
tion of the bytes stored by the processors to the overlapping bytes. The 
portions of the locations that do not overlap contain the bytes stored 
by the processor storing to the location. 

5. When a processor executes an atomic store to a location, a second 
processor executes an atomic load from that location, and no other 
store is performed to that location, the value returned by the load is 
the content of the location prior to the store or the content of the loca- 
tion subsequent to the store. 

6. When a load and a store with the same target location can be executed 
simultaneously, and no other store is performed to the location, the 
value returned by the load is some combination of the content of the 
location before the store and after the store. 
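
Cases 3 and 6 above permit any byte-wise mixture of two values. As an illustration, the following C sketch tests whether a 32-bit result could have arisen from two non-atomic word stores of the values a and b to the same location. The function name is hypothetical, and the sketch assumes the mixing occurs at byte granularity, as the text describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every byte of `result` equals the corresponding byte of
 * either `a` or `b`, i.e. `result` is some combination of the bytes
 * stored by the two processors. */
bool is_byte_combination(uint32_t result, uint32_t a, uint32_t b)
{
    for (int i = 0; i < 4; i++) {
        uint32_t mask = (uint32_t)0xFFu << (8 * i);
        if ((result & mask) != (a & mask) &&
            (result & mask) != (b & mask))
            return false;
    }
    return true;
}
```

With atomic stores (case 2), only the two results equal to a or b themselves are possible.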

1.5 Memory Coherence 

Coherence refers to the ordering of writes to a single location. Atomic 
stores to a given location are coherent if they are serialized in some order, 
and no processor is able to observe any subset of those stores as occur- 
ring in a conflicting order. This serialization order is an abstract sequence 
of values; the physical memory location need not assume each of the values 
written to it. For example, if a processor has a store-in cache, it may 
update a location several times before the value is written to the physical 
memory. The result of a store operation is not available to every proces- 
sor at the same instant, and it may be that a processor observes only some 
of the values that are written to a location. However, when a location is 
accessed atomically and coherently by all processors, then, for any pro- 
cessor, the sequence of values it loads from the location during any inter- 
val of time forms a subsequence of the sequence of values that the 
location logically held during that interval. That is, a processor can never 
load a "newer" value first and then, later, load an "older" value. 
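
The subsequence property can be expressed as a check on a recorded trace. The following C sketch is a testing aid under stated assumptions, not part of the architecture. The array order is taken to hold the serialized sequence of values of one location, and observed the values one processor loaded, in program order. Repeated loads of the same value are accepted, since a processor may load a location several times while it holds one value.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the values in `observed` can be matched to positions in
 * `order` that never move backward, so that no "older" value is
 * loaded after a "newer" one. */
bool observation_is_coherent(const int *observed, size_t n_obs,
                             const int *order, size_t n_ord)
{
    size_t j = 0;               /* earliest position still permitted */
    for (size_t i = 0; i < n_obs; i++) {
        while (j < n_ord && order[j] != observed[i])
            j++;                /* skip values this processor missed */
        if (j == n_ord)
            return false;       /* value absent or out of order */
    }
    return true;
}
```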

As noted in Section 1.6, "Storage Control Attributes," on page 325, 
the coherence of storage locations may be managed by hardware or soft- 
ware depending on the setting of the Memory Coherence attribute. 

Memory coherence is managed in blocks called coherence blocks. 
Their size is implementation-dependent (see the Book IV, PowerPC 
Implementation Features document for the implementation), but is usu- 
ally larger than a word and often the size of a cache block. 



1.5.1 Coherence Required 

When a storage location is in Memory Coherence Required mode, each 
store to that location must be serialized with all stores to that location by 
all other processors that also access the location coherently. This require- 
ment can be satisfied, for example, by implementing an ownership proto- 
col that allows at most one processor at a time to store to the location. 

Coherence does not ensure that the result of a store by one processor 
will be immediately visible to all other processors and mechanisms in the 
system. Only after a program has executed the sync instruction are the 
previous storage accesses it has executed guaranteed to have been per- 
formed with respect to all other processors and mechanisms. 
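
A portable analogue of this use of sync can be written with C11 fences. The sketch below is an assumption-laden illustration rather than the architecture's own notation: it uses atomic_thread_fence with sequentially consistent ordering, which compilers for PowerPC commonly implement with the sync instruction. The names payload, ready, publish, and try_consume are hypothetical.

```c
#include <stdatomic.h>

int payload;                    /* ordinary shared data */
atomic_int ready;               /* flag, initially zero */

/* Producer: make the data visible before the flag is set. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_seq_cst);  /* plays the role of sync */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and copies the data if the flag is set,
 * otherwise returns 0 and leaves *out unchanged. */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_seq_cst);
    *out = payload;
    return 1;
}
```

Without the fence in publish, another processor could observe the flag set before the store to payload has been performed with respect to it.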



Programming Note 

In a single-cache system, Coherence Required mode is not necessary 
for correct coherent execution. In fact, in such a system, Coherence 
Not Required mode may give better performance. 



1.5.2 Coherence Not Required 

When a storage location is in Memory Coherence Not Required mode, 
storage coherence need not be enforced. This coherence mode may be 
selected by software to improve performance when it is known that the 
particular area of storage the processor is accessing will not be accessed 
by another processor or mechanism. In this mode, software must ensure 
that the appropriate Cache Management instructions have been used to 
put storage in a consistent state prior to changing the mode or allowing 
access to that storage area by a different processor or mechanism. 
1.6 Storage Control Attributes 

Some operating systems may provide means to allow programs to specify 
storage control attributes not described in Book II. The definition of these 
attributes can be found in Book III, Section 4.8, "Storage Access Modes," 
on page 429. The following describes what an operating system that sup- 
ports these functions is expected to provide. The details may vary among 
operating systems, so the details of the specific system being used must be 
known before these functions can be used. 

Generally, the program may use one of each of the following pairs of 
storage attributes: 

■ Write Through Required or Not Required 

■ Caching Inhibited or Allowed 

■ Memory Coherence Required or Not Required 

■ Guarded or Not Guarded 

Not all combinations of these modes are supported; see Book III, Sec- 
tion 4.8.2, "Supported Storage Modes," on page 431 for further details. 

A program can specify, through an operating system service, the 
attributes for each virtual page to which it has access. Each load or store 
will be performed in the following manner, depending on the setting of 
the storage control attributes for the page containing the addressed stor- 
age location. 

Write Through 

This attribute is meaningful only for Caching Allowed storage. It pro- 
vides the program control over whether: 

■ the processor is required to update the copy of the storage location 
in the cache and in main storage, or 

■ the processor is allowed to update the copy of the storage location 
in the cache and to defer the update of main storage. 

Required 

Loads use the copy in the cache if it is there. Stores update the copy 
of the storage location in the cache if it is in the cache and also up- 
date the storage location in main storage. 

Not Required 

Loads and stores use the copy in the cache if it is there. The block 
containing the target storage location may be copied to the cache. 
The storage location in main storage need not contain the value 
most recently stored to that location. 
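
The difference between the two Write Through settings can be seen in a toy model of a single storage location. The following C sketch is illustrative only; the type and function names are hypothetical. In the Not Required case the model always copies the location into the cache on a store, which is one permitted choice ("may be copied"), not a requirement.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t main_storage;      /* value held in main storage */
    uint32_t cached;            /* value held in the cache copy */
    bool     in_cache;          /* whether a cache copy exists */
} location_t;

/* Store to the location under the given Write Through setting. */
void model_store(location_t *loc, uint32_t value, bool write_through_required)
{
    if (write_through_required) {
        if (loc->in_cache)
            loc->cached = value;        /* update the copy if present */
        loc->main_storage = value;      /* and always update main storage */
    } else {
        loc->in_cache = true;           /* modeling choice: allocate on store */
        loc->cached = value;            /* main storage update is deferred */
    }
}

/* Loads use the copy in the cache if it is there. */
uint32_t model_load(const location_t *loc)
{
    return loc->in_cache ? loc->cached : loc->main_storage;
}
```

After a store with Write Through Not Required, a load by the same processor returns the new value while main storage may still hold the old one.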

Caching 
Inhibited 

When caching is inhibited, the Write Through attribute has no 
meaning. The load or store is executed in the following manner: 

1. The operation is performed to main storage bypassing the cache 
(i.e., neither the target location nor any of the block(s) contain- 
ing it is copied into the cache). 

2. The operation causes an access (load/store) of appropriate length 
(i.e., byte, halfword, word, etc.) to the target location in main 
storage. 

It is considered a programming error if a copy of the target location 
of an access to Caching Inhibited storage is in the cache. Software 
must ensure that the location has not previously been brought into 
the cache or, if it has, that it has been flushed from the cache. If the 
programming error occurs, the result of the access is boundedly un- 
defined. 

Allowed 

When caching is allowed, the access is performed in the following 
manner: 

1. If the block containing the target storage location is in the cache, 
it is used. 

2. If the block containing the target location is not in the cache, the 
block(s) of storage containing the target location may be copied 
to the cache and, if the access is a store, the target location is 
updated in the cache if it is in the cache. 

Memory Coherence 

This attribute provides the program control over whether the proces- 
sor maintains storage coherence: 

Required 

Stores by all processors to the same location are serialized into some 
order and no processor is able to observe any subset of those stores 
as occurring in a conflicting order. 

Not Required 

The order in which one processor observes the stores performed by 
one or more other processors is undefined. 



Programming Note 

Software must ensure that all locations in a page have been purged 
from the cache prior to changing the storage mode for the page from 
Caching Allowed to Caching Inhibited. 
When coherence is required, its serialization function is effective for all 
supported combinations of the Write Through and Caching modes 
(see Book III, Section 4.8.2, "Supported Storage Modes," on 
page 431). 

When coherence is not required, the programmer must manage the 
coherence of storage through use of sync and Cache Management 
instructions, and facilities provided by the operating system. 

Guarded 

This attribute provides the program control over the conditions under 
which data and instructions can be accessed speculatively. See Book III, 
Section 4.2.5, "Speculative Execution," on page 396 for a more com- 
plete definition. 

Guarded 

Data cannot be accessed speculatively, and instructions cannot be 
fetched speculatively, except under the conditions described in Book 
III. 

Not Guarded 

Data can be accessed speculatively, and instructions can be fetched 
speculatively. 

1.7 Cache Models 

The PowerPC architecture does not require any particular cache organi- 
zation and allows many different implementations. However, for a pro- 
gram to execute correctly on all implementations, the programmer 
should assume that separate instruction and data caches exist, and should 
program to the separate cache model. The functions of these caches are 
affected by the storage control attributes associated with each storage 
access as described in Section 1.6, "Storage Control Attributes," on 
page 325. Cache Management instructions are provided so programs can 
manage the caches when needed. Depending on the storage control 
attributes specified by the program and the function being performed, the 
program may need to use these instructions to guarantee that the func- 
tion is performed correctly. The Cache Management instructions are also 
useful to optimize the use of memory bandwidth in such applications as 
graphics and numerically intensive computing. 

The processor is not required to maintain copies of storage locations
in the instruction cache consistent with changes to storage resulting from
the execution of store instructions. Program management of the cache is
required when the program generates or modifies code that will be
executed (i.e., when the program modifies data in storage and then attempts
to execute the modified data as instructions).

Programming Note

Implementations will vary as to what instructions need be executed to
perform a function such as code modification. Operating systems are
encouraged to provide a service (implementation-dependent) to perform
the function in an efficient manner.

The instructions provided allow the program to: 

■ invalidate the copy of storage in an instruction cache block (icbi) 

■ perform context synchronization, as described in Book III, Section 
1.7.1, "Context Synchronization," on page 371 (isync) 

■ give a hint that a block of storage should be copied into the data
cache, so that the copy of the block may be in the cache when subsequent
accesses to the block occur, thereby reducing delays (dcbt, dcbtst)

■ set the content of a data cache block to zeros (dcbz)

■ copy the content of a data cache block to main storage (dcbst)

■ copy the content of a data cache block to main storage and make the
copy of the block in the data cache invalid (dcbf)

The function of the Cache Management instructions depends on the 
implementation of the caches and on the storage control attributes associ- 
ated with the cache block that is the target of the Cache Management 
instruction. 

There are many variations of cache implementations and the following 
sections do not attempt to describe them exhaustively. However, the vari- 
ations that affect the function of the Cache Management instructions are 
discussed here. 
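
The need for program management of the caches when code is modified can be illustrated with a toy model. The sketch below is not an implementation of any PowerPC processor: the class name, the 32-byte block size, and the one-value-per-block simplification are all assumptions made for illustration, and the model omits the sync and isync steps that a real code-modification sequence also needs.

```python
BLOCK = 32  # hypothetical cache block size in bytes (implementation-dependent)

class SplitCacheModel:
    """Toy single-processor model with separate instruction and data caches."""

    def __init__(self):
        self.memory = {}   # block number -> value (one value per block, for brevity)
        self.dcache = {}   # block number -> (value, modified flag)
        self.icache = {}   # block number -> value

    def store(self, addr, value):
        # A store updates the data cache only; main storage is not written yet.
        self.dcache[addr // BLOCK] = (value, True)

    def fetch(self, addr):
        # An instruction fetch uses the instruction cache, filling from main storage.
        blk = addr // BLOCK
        if blk not in self.icache:
            self.icache[blk] = self.memory.get(blk, 0)
        return self.icache[blk]

    def dcbst(self, addr):
        # Copy a modified data cache block to main storage.
        blk = addr // BLOCK
        if blk in self.dcache and self.dcache[blk][1]:
            value, _ = self.dcache[blk]
            self.memory[blk] = value
            self.dcache[blk] = (value, False)

    def icbi(self, addr):
        # Invalidate the block in the instruction cache.
        self.icache.pop(addr // BLOCK, None)

m = SplitCacheModel()
m.memory[0x1000 // BLOCK] = "old"
first = m.fetch(0x1000)    # instruction cache now holds "old"
m.store(0x1000, "new")     # program modifies the code as data
stale = m.fetch(0x1000)    # still served from the instruction cache
m.dcbst(0x1000)            # push the modified block to main storage
m.icbi(0x1000)             # discard the stale instruction cache copy
fresh = m.fetch(0x1000)    # refetched from main storage
```

In this model the fetch between the store and the dcbst/icbi pair still sees the old contents, which is the situation the Cache Management instructions exist to resolve.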



1.7.1 Split or Dual Caches 

A cache model in which there are separate caches for instructions and 
data is called a "Harvard-style" cache. This style is the standard 
PowerPC cache model; that is, it is the model assumed by this architec- 
ture, and the function of the Cache Management instructions depends on 
this model as well as on the storage control attributes of the target stor- 
age block. A copy of a target block in the cache is said to be marked 
invalid if it will not be used for subsequent accesses. The following sec- 
tions describe the functions performed by each of the Cache Management 
instructions in this model. 

Instruction Cache Block Invalidate

This instruction permits the program to invalidate the target storage 
block in the instruction cache, causing any subsequent fetch request for 
an instruction in the block to be sent to main storage (because the block 
is not found in the instruction cache). The instruction performs the fol- 
lowing operations: 

1. If the target block is not accessible to the program for loads, the system
data storage error handler may be invoked.

2. Memory Coherence 
Required 

If the target block is in any of the instruction caches in the system, 
it is marked invalid in those caches. 

Not Required 

If the target block is in the instruction cache of the executing pro- 
cessor, it is marked invalid in that cache. 

3. This access need not be recorded, but if it is, it is considered a load and
not a store.

Data Cache Block Touch 

The two Touch instructions (one for reading, the other for writing) per- 
mit the program to attempt to have the target storage block in the cache 
prior to its first use, and thereby to avoid some of the delays due to 
accessing storage. These instructions are performance hints, and perform 
the following operations: 

1. If the target block is not accessible to the program for loads, no other
operation is performed. 

2. Caching 
Inhibited 

The target block is not copied into the data cache of the executing 
processor and no other operations are performed. 

Allowed 

Memory Coherence 

Required 

If the target block is not in the data cache of the executing 
processor, the most recent version of the block may be copied 
into that data cache.

Not Required

If the target block is not in the data cache of the executing 
processor, the block may be copied into that data cache from 
main storage without regard for the location of the most 
recently modified version. 

3. This access need not be recorded, but if it is, it is considered a load and
not a store.

If the instruction is Touch for Store and the block is copied into the 
cache, it is copied in a manner such that a subsequent store to the block 
will execute efficiently. 

The execution of either of these instructions never causes the system 
data storage error handler to be invoked. 

Data Cache Block set to Zero 

This instruction permits the program to set large areas of storage to zeros 
in an efficient manner. The instruction performs the following operations: 

1. If the target block is not accessible to the program for stores, the system
data storage error handler is invoked.

2. Write Through Required 

Either each byte of the target block in main storage is set to 0x00, or 
the system alignment error handler is invoked. 

3. Caching Inhibited 

Either each byte of the target block in main storage is set to 0x00, or 
the system alignment error handler is invoked. 

4. Memory Coherence 
Required 

■ If the target block is in the data cache of the executing processor, 
each byte in the block is set to 0x00 and all copies of the block in 
all data caches in the system are made consistent. 

■ If the target block is not in the data cache of the executing proces- 
sor, the block is established in that data cache without fetching it 
from main storage and each byte in the block is set to 0x00. All 
copies of the block in all data caches in the system are made consistent.

Not Required 

■ If the target block is in the data cache of the executing processor, 
each byte in the block is set to 0x00. 

■ If the target block is not in the data cache of the executing proces- 
sor, the block is established in that data cache without fetching it 
from main storage and each byte in the block is set to 0x00. 

5. This access must be recorded. It is considered a store. 

Data Cache Block Store 

This instruction permits the program to ensure that the latest version of 
the target storage block is in main storage. The instruction performs the 
following operations: 

1. If the target block is not accessible to the program for loads, the system
data storage error handler may be invoked.

2. Memory Coherence 
Required 

If the target block is in any of the data caches in the system and has 
been modified, it is copied to main storage. 

Not Required 

If the target block is in the data cache of the executing processor 
and has been modified, it is copied to main storage. 

3. This access need not be recorded, but if it is, it is considered a load and
not a store.

Data Cache Block Flush 

This instruction permits the program to ensure that the latest version of 
the target storage block is in main storage and no longer in the data 
cache. The instruction performs the same operations as does Data Cache 
Block Store. In addition to those operations, the following is done.

Memory Coherence
Required 

If the target block is in any of the data caches in the system, it is 
marked invalid in those data caches.

Not Required 

If the target block is in the data cache of the executing processor, it 
is marked invalid in that data cache. 
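
The difference between Data Cache Block set to Zero, Data Cache Block Store, and Data Cache Block Flush can be sketched with a small model of one processor's data cache in Memory Coherence Not Required mode. The class name, the block numbering, and the 32-byte block size are illustrative assumptions, and access checks and recording are left out.

```python
class DataCacheModel:
    """Toy write-back data cache for a single processor."""

    def __init__(self, memory, block_size=32):
        self.memory = memory          # block number -> bytes in main storage
        self.block_size = block_size  # hypothetical block size
        self.lines = {}               # block number -> [bytearray, modified flag]

    def dcbz(self, blk):
        # Establish the block in the cache without fetching it, set to zeros.
        self.lines[blk] = [bytearray(self.block_size), True]

    def dcbst(self, blk):
        # Copy the block to main storage if it is cached and modified.
        line = self.lines.get(blk)
        if line is not None and line[1]:
            self.memory[blk] = bytes(line[0])
            line[1] = False

    def dcbf(self, blk):
        # Same as dcbst, then mark the block invalid in the cache.
        self.dcbst(blk)
        self.lines.pop(blk, None)

memory = {7: b"\xff" * 32}
cache = DataCacheModel(memory)
cache.dcbz(7)
before_dcbst = memory[7]            # main storage not yet updated
cache.dcbst(7)
after_dcbst = memory[7]             # zeros now in main storage
still_cached = 7 in cache.lines     # dcbst leaves the block valid
cache.dcbf(7)
cached_after_dcbf = 7 in cache.lines
```

The model keeps the block valid after dcbst and drops it after dcbf, which is the only behavioral difference between the two instructions described above.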

1.7.2 Combined Cache 

A combined cache implementation provides a single cache for instruc- 
tions and data. For this implementation, the Instruction Cache Block 
Invalidate instruction need not perform the same operations as it would 
for an implementation with separate caches. The instruction is treated as 
a no-op, except that it is acceptable to invalidate the target block in the 
instruction caches of other processors if the block is in Memory Coher- 
ence Required mode. 

1.7.3 Write Through Data Cache 

The Cache Management instructions affected by the write through imple- 
mentation of the data cache are listed in this section. These instructions 
must perform all the operations specified for a Harvard-style cache 
except as specified in this section. Some of the differences depend on 
whether the write through implementation is a write through to main 
storage or just a write through to a second level of cache. 

Write Through to Main Storage 

1. Data Cache Block set to Zero 

The processor may invoke the system alignment error handler regard- 
less of the setting of the storage control attributes. 

2. Data Cache Block Store 

By definition, the cache cannot contain a modified block. The proces- 
sor is not required to copy the target block to main storage. 

3. Data Cache Block Flush 

By definition, the cache cannot contain a modified block. The proces- 
sor is not required to copy the target block to main storage. 



Write Through to Multilevel Cache 

For Data Cache Block set to Zero, the processor may invoke the system 
alignment error handler regardless of the setting of the storage control 
attributes. 

If a cache is the interface to main storage for all processors and other 
mechanisms that access storage, that cache can be considered main stor- 
age with respect to the Cache Management instructions. Otherwise, the 
cache instructions that cause the content of a cache block to be copied 
back to main storage or to be marked invalid must be performed against 
all levels of the cache. 

1.8 Shared Storage 

This architecture supports the sharing of storage between programs, 
between different instances of the same program on systems with one or 
more processors, and between processors and other mechanisms. It also 
supports access to a storage location by one or more programs using dif- 
ferent effective addresses. All these cases are considered storage sharing. 
Storage is shared in blocks that are an integral number of pages. 

When the same storage location has different effective addresses, the 
addresses are said to be aliases. Each application can be granted separate 
access privileges to aliased pages. 

1.8.1 Storage Access Ordering 

The PowerPC Architecture specifies a weakly consistent storage model 
for shared storage multiprocessor systems. This model provides an 
opportunity for significantly improved performance over the strongly 
consistent model, but places the responsibility on the program to ensure 
that ordering or synchronization instructions are properly placed when 
necessary for the correct execution of the program. 

In this architecture, the order in which the processor performs storage 
accesses, the order in which those accesses complete in main storage, and 
the order in which those accesses are viewed as occurring by another
processor may all be different. This property is referred to as storage access
ordering. A means of enforcing an ordering of storage accesses is provided
to allow programs or instances of programs to share storage. Similar
means are needed to allow programs executing on a processor to
share storage with some other mechanism, such as an I/O device, that can 
also access storage. 

The purpose of specifying a weakly consistent storage model is to
allow the processor to run very fast for most storage accesses. Two
instructions, Enforce In-order Execution of I/O and Synchronize, are
provided to enable the program to control the order in which storage 
accesses are performed by separate instructions. No ordering should be 
assumed for the storage accesses done by a multiple-register load or store 
instruction, and no means are provided for controlling that order. 

Enforce In-order Execution of I/O 

This instruction permits the program to control the order in which loads 
and stores are performed in main storage when the accessed storage has 
certain attributes. The data accesses affected by eieio are: loads and 
stores to storage that is both Caching Inhibited and Guarded; and stores 
to storage that is Write Through Required. All applicable data accesses 
are ordered as a single set (i.e., there is not one order for loads and stores 
to Caching Inhibited and Guarded storage, and another order for stores 
to Write Through Required storage). eieio does not affect the order of
other data accesses, or of cache operations (whether caused explicitly by 
execution of a Cache Management instruction or implicitly by the cache 
coherence mechanism). 
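
The set of accesses affected by eieio can be written as a predicate. This is a literal reading of the paragraph above, not a statement about any particular implementation; the function and parameter names are hypothetical.

```python
def ordered_by_eieio(kind, caching_inhibited, guarded, write_through_required):
    """Return True if a data access belongs to the set that eieio orders.

    kind is "load" or "store"; the three flags are the storage control
    attributes of the storage being accessed.
    """
    if kind not in ("load", "store"):
        raise ValueError("kind must be 'load' or 'store'")
    # Loads and stores to storage that is both Caching Inhibited and Guarded.
    if caching_inhibited and guarded:
        return True
    # Stores (but not loads) to Write Through Required storage.
    if kind == "store" and write_through_required:
        return True
    return False

io_register_store = ordered_by_eieio("store", True, True, False)
write_through_load = ordered_by_eieio("load", False, False, True)
ordinary_store = ordered_by_eieio("store", False, False, False)
```

Accesses for which the predicate is false, such as stores to ordinary cacheable storage, need sync rather than eieio if their order matters.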

eieio ensures that all applicable data accesses to main storage previ- 
ously initiated by the processor have completed with respect to main stor- 
age before any applicable storage accesses subsequently initiated by the 
processor access main storage. It acts like a barrier that flows through the 
storage queues and to main storage, preventing the reordering of storage 
accesses across the barrier. The eieio instruction may complete before 
previously initiated storage accesses have been performed with respect to 
other processors and mechanisms. 

eieio can be used, for example, to ensure that the data from a sequence 
of stores to the control registers of an I/O device update those control 
registers in the order specified by the stores as ordered by eieio. 

If stronger ordering is desired or if it is necessary to order accesses to 
storage that may be in the cache, the sync instruction must be used. 

Synchronize 

When a portion of storage must be forced to a known state, it is neces- 
sary to synchronize storage with respect to all processors and mecha- 
nisms. This synchronization is accomplished by requiring programs to 
indicate explicitly in the instruction stream, by inserting a sync instruction,
that synchronization is required. Only when sync completes are the
effects of all storage accesses previously executed by the program guaran- 
teed to have been performed with respect to all other processors and 
mechanisms. 

The sync instruction permits the program to ensure that all storage 
accesses it has initiated have been performed with respect to all other pro- 
cessors and mechanisms before its next instruction is executed. A pro- 
gram can use this instruction to ensure that all updates to a shared data 
structure are visible to all other processors prior to executing a store that 
will release the lock on that data structure. Execution of this instruction 
does the following: 

■ Performs the functions described for the sync instruction in Book I, 
"Synchronize X-form," on page 80. 

■ Ensures that consistency operations and the effects of icbi, dcbz, 
dcbst, dcbf, and dcbt instructions (see Book III, "Data Cache Block
Invalidate X-form," on page 439) previously executed by the proces- 
sor executing the sync have completed on all other processors. 

■ Ensures that TLB invalidates previously executed by the processor
executing the sync have completed on that processor. sync does not wait
for such invalidates to complete on other processors (see Book III,
Section 4.12, "Table Update Synchronization Requirements," on
page 446).

■ Ensures that storage accesses due to instructions previously executed 
by the processor executing the sync are recorded in the Reference and 
Change bits in the Page Table (see Book III, Section 4.9, "Reference 
and Change Recording," on page 433). 

The sync instruction is execution synchronizing (see Book III, Section 
1.7.2, "Execution Synchronization," on page 372). It is not context
synchronizing (see Book III, Section 1.7.1, "Context Synchronization," on
page 371), and therefore need not discard prefetched instructions. 

For storage that is maintained as Memory Coherence Not Required, 
the only effect of sync on storage operations is to ensure that all previous 
storage accesses have completed to the level of storage specified by the 
Caching and Write Through storage control attributes (including the 
updating of Reference and Change bits). 

1.8.2 Atomic Update Primitives

The Load And Reserve and Store Conditional instructions together per- 
mit atomic update of a storage location. 64-bit implementations have 
word and doubleword forms of each of these instructions. Described here 
is the operation of the word forms lwarx and stwcx.; operation of the
doubleword forms ldarx and stdcx. is the same except for obvious
substitutions.

These instructions function in Caching Inhibited, as well as in Caching 
Allowed, storage. The addressed page must, however, have the Memory 
Coherence Required attribute for every processor, other than the one 
doing the atomic update, that might execute a store to the location being 
atomically updated. The remainder of this section assumes that if the sys- 
tem is a multiprocessor, then all processors have the addressed page in 
Memory Coherence Required mode. 

If the addressed storage is in Write Through Required mode, it is 
implementation-dependent whether these instructions function correctly 
or cause the system data storage error handler to be invoked. 

The lwarx instruction is a load from a word-aligned location that has
two side effects.

1. A nonspecific reservation for a subsequent stwcx. or stdcx. is created.

2. The storage coherence mechanism is notified that a reservation exists
for the real address (see Book III, Chapter 4, "Storage Control," on
page 391) corresponding to the storage location accessed by the
lwarx.

The stwcx. instruction is a store to a word-aligned location that is
conditioned on the existence of the reservation created by the lwarx or ldarx.
To emulate an atomic operation with these instructions, it is necessary
that both the lwarx and the stwcx. access the same storage location even
though this requirement is not enforced by the hardware. lwarx and
stwcx. are ordered by a dependence on the reservation, and the program
is not required to insert other instructions to maintain the order of storage
accesses by these two instructions.

A stwcx. performs a store to the target storage location only if the
storage location accessed by the lwarx that established the reservation
has not been stored into by another processor or mechanism between
supplying a value for the lwarx and storing the value supplied by the
stwcx. In this case, CR0 is set to indicate that the store was performed.

If the stwcx. completes but does not perform the store because a
reservation no longer exists, CR0 is set to indicate that the stwcx. completed
but storage was not altered.

Examples of the use of lwarx and stwcx. are given in Book I, Section
E.1, "Synchronization," on page 249.

When stwcx. to a given location succeeds, its store has been performed 
but may not yet be visible to all other processors and mechanisms. As a 
result, a subsequent load or lwarx from the given location on another
processor may return a "stale" value. However, a subsequent lwarx from
the given location on the other processor followed by a successful stwcx.
on that processor is guaranteed to have returned the value stored by the 
first processor's stwcx. (in the absence of other stores to the given loca- 
tion). 



Reservations 

The ability to emulate an atomic operation using lwarx and stwcx. is
based on the conditional behavior of stwcx., the reservation set by lwarx,
and the clearing of that reservation if the target location is modified by
another processor or mechanism before the stwcx. performs its store.

A processor has at most one reservation at any time. A reservation is
established by executing a lwarx instruction and is lost if any of the
following occur:

■ The processor holding the reservation executes another lwarx or
ldarx; this clears the first reservation and establishes a new one.

■ The processor holding the reservation executes any stwcx. or stdcx.,
whether or not its address matches that of the lwarx.

■ Some other processor executes a store or dcbz to the same reservation 
granule, or modifies a Reference or Change bit (see Book III, Section 
4.9, "Reference and Change Recording," on page 433) in the same 
reservation granule. 

■ Some other mechanism modifies a storage location in the same reser- 
vation granule. 

■ Any additional causes of reservation loss are described in the Book IV, 
PowerPC Implementation Features document for the implementation. 

Interrupts (see Book III, Chapter 5, "Interrupts," on page 453) do not 
clear reservations (however, system software invoked by interrupts may 
clear reservations). Immunity to random reservation loss ensures that 
programs using lwarx and stwcx. can make forward progress.
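
The reservation rules above can be exercised with a small simulation. This is a sketch under stated assumptions, not a description of any hardware: the class and function names are hypothetical, the 32-byte reservation granule is an arbitrary choice, and only the loss conditions for lwarx, stwcx., and stores by another processor are modeled.

```python
GRANULE = 32  # hypothetical reservation granule size in bytes

class ReservationModel:
    """Toy multiprocessor model of lwarx/stwcx. reservations."""

    def __init__(self, nproc):
        self.memory = {}                    # address -> word value
        self.reservation = [None] * nproc   # reserved granule number per processor

    def lwarx(self, cpu, addr):
        # Establishes a new reservation, replacing any earlier one.
        self.reservation[cpu] = addr // GRANULE
        return self.memory.get(addr, 0)

    def store(self, cpu, addr, value):
        # A store clears other processors' reservations on the same granule.
        self.memory[addr] = value
        for other in range(len(self.reservation)):
            if other != cpu and self.reservation[other] == addr // GRANULE:
                self.reservation[other] = None

    def stwcx(self, cpu, addr, value):
        # Succeeds only if a reservation is held; the address is not compared.
        held = self.reservation[cpu] is not None
        self.reservation[cpu] = None        # any stwcx. clears the reservation
        if held:
            self.store(cpu, addr, value)
        return held                         # stands in for the CR0 indication

def atomic_add(system, cpu, addr, delta):
    """Retry loop: load and reserve, compute, store conditionally."""
    while True:
        old = system.lwarx(cpu, addr)
        if system.stwcx(cpu, addr, old + delta):
            return old

s = ReservationModel(2)
value = s.lwarx(0, 0x100)
s.store(1, 0x104, 9)                        # same granule, different word
lost = not s.stwcx(0, 0x100, value + 1)     # reservation was lost
previous = atomic_add(s, 0, 0x100, 5)       # uncontended retry loop succeeds
```

The failed stwcx. in the example also shows reservation loss due to granularity: processor 1 stored to a neighboring word, not to the reserved word itself.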



Programming Note

To ensure that a store or stwcx. to a given location has been performed
with respect to all other processors and mechanisms, it must be followed
by a sync. A subsequent load or lwarx from the given location by another
processor will then return a value at least as recent as the value stored.
This is often more synchronization than is actually needed to ensure
program correctness.

Programming Note

One use of lwarx and stwcx. is to emulate a "Compare and Swap"
primitive like that provided by the IBM System/370 Compare and Swap
instruction: see Book I, "Compare and Swap," on page 252. A
System/370-style Compare and Swap checks only that the old and current
values of the word being tested are equal, with the result that programs
that use such a Compare and Swap to control a shared resource can err if
the word has been modified and the old value subsequently restored. The
combination of lwarx and stwcx. improves on such a Compare and Swap,
because the reservation reliably binds the lwarx and stwcx. together. The
reservation is always lost if the word is modified by another processor or
mechanism between the lwarx and stwcx., so the stwcx. never succeeds
unless the word has not been stored into (by another processor or
mechanism) since the lwarx.
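
The distinction drawn in this note can be shown with a toy model of a single shared word. The class and method names below are hypothetical, and the model tracks only one observing processor's reservation; it is a sketch of the idea, not of the instructions' full behavior.

```python
class SharedWord:
    """One word of storage, watched by a single observing processor."""

    def __init__(self, value):
        self.value = value
        self.reserved = False   # reservation held by the observing processor

    def load_reserve(self):
        self.reserved = True
        return self.value

    def store_by_other(self, value):
        # Any store by another processor clears the reservation.
        self.value = value
        self.reserved = False

    def store_conditional(self, value):
        ok, self.reserved = self.reserved, False
        if ok:
            self.value = value
        return ok

    def compare_and_swap(self, expected, new):
        # Value comparison only, in the System/370 style.
        if self.value != expected:
            return False
        self.value = new
        return True

# The word is changed to "B" and then restored to "A" behind the observer's back.
w1 = SharedWord("A")
old = w1.load_reserve()
w1.store_by_other("B")
w1.store_by_other("A")
cas_succeeded = w1.compare_and_swap(old, "C")   # modification goes unnoticed

w2 = SharedWord("A")
old = w2.load_reserve()
w2.store_by_other("B")
w2.store_by_other("A")
sc_succeeded = w2.store_conditional("C")        # reservation was lost
```

The value-based comparison accepts the restored word, while the conditional store fails because the intervening stores cleared the reservation.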

Programming Note

Programming convention must ensure that lwarx and stwcx. addresses
match. In proper use, a stwcx. should be paired with a specific lwarx to
the same real address. Situations in which a stwcx. may erroneously be
issued after some lwarx other than that with which it is intended to be
paired must be scrupulously avoided. For example, there must not be a
context switch in which the processor holds a reservation in behalf of
the old context, and the new context resumes after a lwarx and before
the paired stwcx.. The stwcx. in the new context would complete
successfully, which is not what was intended by the programmer.

Such a situation must be prevented by issuing a stwcx. to a dummy
writable word-aligned location as part of the context switch, thereby
clearing any reservation established by the old context. Executing
stwcx. to a word-aligned location suffices to clear the reservation,
whether it was obtained by lwarx or ldarx.



Guaranteeing Forward Progress 

Forward progress in loops that use lwarx and stwcx. is guaranteed by a
cooperative effort between hardware, operating system software, and
application software. Hardware guarantees that

■ one stwcx. among a set of processors holding reservations to the same 
real address will succeed, and 

■ reservations are not lost unnecessarily, i.e., when the reserved location 
has not been modified. 

While no general rules can be given regarding operating system
guarantees, programs that use the examples in Book I, Section E.1,
"Synchronization," on page 249 are guaranteed forward progress.



Reservation Loss Due to Granularity 

When one processor holds a reservation and another processor performs 
a store that might clear that reservation, the address comparison is done 
in a way that ignores an implementation-dependent number of low-order 
bits of the real addresses. The storage block corresponding to the ignored 
low-order bits is called the reservation granule. Its size is implementation- 
dependent (see the Book IV, PowerPC Implementation Features docu- 
ment for the implementation) but is a multiple of the coherence block 
size. 

Lock words should be allocated such that contention for the locks and 
updates to nearby data structures do not cause excessive reservation 
losses from false indications of sharing that can occur due to the reserva- 
tion granularity. 

A processor holding a reservation on any word in a reservation gran- 
ule will lose its reservation if some other processor stores anywhere in 
that granule. Such problems can be avoided only by ensuring that few 
such stores occur. This can most easily be accomplished by allocating an
entire granule for a lock and wasting all but one word. 

Reservation granularity may vary for each implementation. There are 
no architectural restrictions bounding the granularity implementations 
must support, so reasonably portable code must dynamically allocate 
aligned and padded storage for locks to guarantee absence of granularity- 
induced reservation loss. 
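
Allocating one whole granule per lock comes down to simple address arithmetic. The sketch below assumes the granule size has been obtained at run time (for example from an operating system service); the function names and the sample addresses are hypothetical.

```python
def align_up(addr, granule):
    """Round addr up to the next multiple of the reservation granule size."""
    return ((addr + granule - 1) // granule) * granule

def lock_address(base, index, granule):
    """Address of the index-th lock word, one lock per granule.

    base is the start of a buffer that was over-allocated by at least one
    granule so that the aligned region fits inside it.
    """
    return align_up(base, granule) + index * granule

granule = 128                     # example value; implementation-dependent
base = 0x1004                     # example unaligned buffer address
first_lock = lock_address(base, 0, granule)
third_lock = lock_address(base, 2, granule)
```

Each lock word then sits alone in its granule, so stores to one lock or to neighboring data cannot clear a reservation held on another lock.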

The placement (location and alignment) of operands in storage affects
relative performance of storage accesses and may affect it significantly. 
The best performance is guaranteed if storage operands are aligned. In 
order to obtain the best performance across the widest range of imple- 
mentations, the programmer should assume the performance model 
described in Figures 46 and 47 with respect to the placement of storage 
operands. Figure 46 applies when the processor is in Big-Endian mode, 
and Figure 47 applies when the processor is in Little-Endian mode. Per- 
formance of accesses varies depending on the following: 

1. Operand Size

2. Operand Alignment 

3. Endian mode (Big-Endian or Little-Endian) 

4. Crossing no boundary 

5. Crossing a cache block boundary 

6. Crossing a page boundary that is also a protection boundary (see 
Book III, Section 4.10, "Storage Protection," on page 436) 

7. Crossing a BAT boundary (see Book III, Section 4.7, "Block Address 
Translation," on page 423) 

8. Crossing a segment boundary (see Book III, Section 4.2.1, "Storage 
Segments," on page 393) 

The Load and Store Multiple instructions are defined to operate only
on aligned operands. The Move Assist instructions have no alignment 
requirements. Both of these sets of instructions are supported only in Big- 
Endian mode. 

For the purposes of Figures 46 and 47, crossing a boundary between 
pages with different storage control attributes is equivalent to crossing a 
segment boundary. 
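
Whether a given operand is aligned and which boundaries it crosses can be computed from its address and size. The following sketch is illustrative only: the function names are hypothetical, and the 32-byte cache block and 4096-byte page are example parameters, not values stated in this section.

```python
def crosses(addr, size, boundary):
    """True if the bytes addr .. addr+size-1 span a multiple of boundary."""
    return addr // boundary != (addr + size - 1) // boundary

def placement(addr, size, block=32, page=4096):
    """Classify an operand by alignment and by the boundaries it crosses."""
    return {
        "aligned": addr % size == 0,
        "crosses_cache_block": crosses(addr, size, block),
        "crosses_page": crosses(addr, size, page),
    }

aligned_word = placement(0x1000, 4)
block_straddler = placement(0x101E, 4)
page_straddler = placement(0x1FFE, 4)
```

An aligned operand whose size is a power of two never crosses a larger power-of-two boundary, so the boundary-crossing cases arise only for unaligned operands and for multiple-register or string accesses.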

Figure 46. Performance effects of storage operand placement, Big-Endian mode

Figure 47. Performance effects of storage operand placement, Little-Endian mode



2.1 Instruction Restart 

If a storage access crosses a page boundary that is also a protection 
boundary, a BAT boundary, or a segment boundary, a number of condi- 
tions could cause the execution of the instruction to be aborted after part 
of the access has been performed. This may occur, for example, when a 
program attempts to access a page it has not previously accessed, or when 
the processor must check for a possible change in storage control 
attributes when an access crosses a page boundary. When this occurs, the 
implementation or the operating system may restart the instruction. If 
the instruction is restarted, some bytes of the location may be loaded 
from or stored to the target location a second time. 

Programming Note

The programmer should assume that any unaligned access in an ordinary
segment might be restarted. Software can ensure that unaligned accesses
are not restarted by placing the unaligned data in direct-store segments
or BAT areas, neither of which permit instruction restart for crossing
internal page boundaries (see Book III, Chapter 4, "Storage Control," on
page 391).

Unsynchronized TLB invalidates do not have a defined result.



The following rules apply to storage accesses with regard to restarting 
the instruction. 

Aligned Accesses 

A single-register instruction that accesses an aligned operand is 
never restarted. 

Unaligned Accesses 

A single-register instruction that accesses an unaligned operand 
may be restarted if the access crosses a page, BAT, or segment 
boundary. 

Load and Store Multiple, Move Assist 

These instructions may be restarted if, in accessing the locations 
specified by the instruction, a page, BAT, or segment boundary is 
crossed. 
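
The three rules can be condensed into a single predicate. This is a restatement of the rules above with hypothetical names, not a model of any implementation's restart logic.

```python
def may_be_restarted(instruction_class, aligned, crosses_boundary):
    """Apply the restart rules.

    instruction_class is "single-register", "multiple" (Load and Store
    Multiple), or "move-assist". crosses_boundary is True if the access
    crosses a page, BAT, or segment boundary.
    """
    if instruction_class == "single-register":
        # Aligned single-register accesses are never restarted.
        return (not aligned) and crosses_boundary
    if instruction_class in ("multiple", "move-assist"):
        return crosses_boundary
    raise ValueError("unknown instruction class")

aligned_load = may_be_restarted("single-register", True, False)
unaligned_crossing = may_be_restarted("single-register", False, True)
multiple_crossing = may_be_restarted("multiple", True, True)
```

A result of True means only that restart is permitted, so software that cannot tolerate a repeated access must avoid those cases.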



2.2 Atomicity and Order 

Access Atomicity 

With the exception of double-precision floating-point operands in 
32-bit implementations, all aligned accesses are atomic. No other 
access is required to be atomic. Instructions causing multiple ac- 
cesses (Load and Store Multiple and Move Assist) are not atomic. 

Access Order 

Since the ordering of storage accesses is not guaranteed unless the 
programmer inserts the appropriate ordering instructions, the or- 
der of accesses generated by a single instruction is not guaranteed. 
Unaligned accesses, Load and Store Multiple instructions, and
Move Assist instructions have no implicit ordering characteristics. 
For example, processor A may store a word operand on an odd 
halfword boundary. It may appear to processor A that the store 
completed atomically. Processor or other mechanism B, executing 
a load from the same location, may get a result that is a combina- 
tion of the value of the first halfword that existed prior to the store 
by processor A and the value of the second halfword stored by pro- 
cessor A. 
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The instructions in this chapter are not privileged. For most of them, if 
the applicable cache is not present the operation is a "no-op" and has no 
effect on any register or on storage. The only exception is the dcbz 
instruction. When the data cache does not exist, dcbz either zeros a cer- 
tain number of bytes of main storage (which has an effect similar to zero- 
ing bytes in a cache block which are later written to main storage) or 
invokes the system alignment error handler (so that its function can be 
simulated). 

As with other storage instructions, the effect of the Cache Manage- 
ment instructions on storage is weakly consistent. If the programmer 
needs to ensure that Cache Management or other instructions have been 
performed with respect to all other processors and mechanisms, a sync 
instruction must be placed in the program following those instructions. 

3.1 Parameters Useful to Application 
Programs 

It is suggested that the operating system provide a service that allows an 
application program to obtain the following information. 

1. Page size 

2. Coherence block size 

3. Granule size for reservations 

4. An indicator of whether the processor has (a) a combined cache or 
no caches, or (b) some other cache configuration (split caches or one 
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cache only; if instruction cache fetches pass through the data cache, 
the cache is considered to be a split cache) 

5. Instruction cache size 

6. Data cache size 

7. Instruction cache line size (see Book IV, PowerPC Implementation 
Features) 

8. Data cache line size (see Book IV) 

9. Block size for icbi (if no instruction cache, number of bytes zeroed by 
dcbz) 

10. Block size for dcbt and dcbtst (if no data cache, number of bytes 
zeroed by dcbz) 

11. Block size for dcbz, dcbst, dcbf, and dcbi [see Book III, Section 
4.11.1, "Cache Management Instructions," on page 438 for a 
description of dcbi (if no data cache, number of bytes zeroed by 
dcbz)] 

12. Instruction cache associativity 

13. Data cache associativity 

14. Factors for converting the Time Base to seconds 

If the caches are combined, the same value should be given for an 
instruction cache attribute and the corresponding data cache attribute. 

3.2 Cache Management Instructions 
3.2.1 Instruction Cache Instructions 

Instruction caches, if they exist, are not required to be consistent with 
data caches, storage, or I/O data transfers. Software must use the appro- 
priate Cache Management instructions to ensure that instruction caches 
are kept consistent when instructions are modified by the processor or by 
input data transfer. When a processor alters a storage location that may 
be contained in an instruction cache, software must ensure that updates 
to storage are visible to the instruction-fetching mechanism. Although 
the instructions to accomplish this vary among implementations and 
hence many operating systems will provide a system service for this func- 
tion, the following sequence is typical. 
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1. dcbst — update storage 

2. sync — wait for update (see Book I, "Synchronize X-form," on 
page 80) 

3. icbi — invalidate copy in instruction cache 

4. isync — perform context synchronization (see Book III, Section 1.7.1, 
"Context Synchronization," on page 371) 

These operations are necessary because the storage may be in Write 
Through Not Required mode. Since instruction fetching may bypass the 
data cache, changes made to items in the data cache may not be reflected 
in storage until after the instruction fetch completes. 
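
The need for the sequence can be illustrated with a small model. The following Python sketch is not part of the architecture; the class ToySplitCache and the function modify_code are hypothetical names, and the model assumes a single block, a write-back data cache, and instruction fetches that bypass the data cache. The sync and isync steps appear only as comments because nothing runs concurrently in this model.

```python
class ToySplitCache:
    """One-block toy model: write-back data cache plus a separate instruction cache."""

    def __init__(self, memory_value):
        self.memory = memory_value   # contents of main storage
        self.dcache = None           # data cache copy, or None if absent
        self.modified = False        # True if the data cache copy is newer than storage
        self.icache = None           # instruction cache copy, or None if absent

    def store(self, value):
        """Processor store: lands in the data cache only (Write Through Not Required)."""
        self.dcache = value
        self.modified = True

    def dcbst(self):
        """Write a modified data cache block back to main storage."""
        if self.modified:
            self.memory = self.dcache
            self.modified = False

    def icbi(self):
        """Invalidate the instruction cache copy so the next fetch goes to storage."""
        self.icache = None

    def fetch(self):
        """Instruction fetch: bypasses the data cache, fills from storage on a miss."""
        if self.icache is None:
            self.icache = self.memory
        return self.icache


def modify_code(cache, new_instruction):
    """Store new code, then run the typical sequence from the text."""
    cache.store(new_instruction)
    cache.dcbst()   # 1. update storage
    # 2. sync: wait for the update (nothing to wait for in this sequential model)
    cache.icbi()    # 3. invalidate copy in instruction cache
    # 4. isync: discard prefetched instructions (not modeled here)


cache = ToySplitCache("old")
cache.fetch()                  # instruction cache now holds the old instruction
cache.store("new")
stale = cache.fetch()          # the store alone is not visible to instruction fetch
modify_code(cache, "new")
fresh = cache.fetch()          # after the sequence, the new instruction is fetched
print(stale, fresh)
```

In this model, omitting dcbst leaves main storage unchanged, so an icbi by itself would simply refetch the old instruction.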



Instruction Cache Block Invalidate X-form 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

If the block containing the byte addressed by EA is in Coherence 
Required mode, and a block containing the byte addressed by EA is in the 
instruction cache of any processor, the block is made invalid in all such 
processors, so that subsequent references cause the block to be refetched. 

If the block containing the byte addressed by EA is in Coherence Not 
Required mode, and a block containing the byte addressed by EA is in the 
instruction cache of this processor, the block is made invalid in this pro- 
cessor, so that subsequent references cause the block to be refetched. 

The function of this instruction is independent of the Write Through 
Required/Not Required and Caching Inhibited/Allowed modes of the 
block containing the byte addressed by EA. 

It is acceptable for the processor to treat this instruction as a load from 
the addressed byte with respect to address translation, storage protection, 
and reference and change recording. Implementations with a combined 
data and instruction cache treat the icbi instruction as a no-op, except 
that they may invalidate the target block in the instruction caches of 
other processors if the block is in Memory Coherence Required mode. 

If the EA references storage outside of main storage (see Book III, Sec- 
tion 4.6, "Direct-Store Segments," on page 421), the instruction is 
treated as a no-op. 

Special Registers Altered 

None 
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Instruction Synchronize XL-form 

isync 

[Power mnemonic: ics] 
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This instruction waits for all previous instructions to complete and 
then discards any prefetched instructions, causing subsequent 
instructions to be fetched (or refetched) from storage and to execute in 
the context established by the previous instructions. This instruction has no 
effect on other processors or on their caches. 

This instruction is context synchronizing (see Book III, Section 1.7.1, 
"Context Synchronization," on page 371). 

Special Registers Altered 

None 



3.2.2 Data Cache Instructions 



Programming Note 

The purpose of the dcbt 
instruction is to allow the 
program to request a 
cache block fetch before 
it is actually needed by 
the program. The 
program can later 
perform loads to put 
data into registers. 
However, the processor is 
not obliged to load the 
addressed block into the 
data cache. 



Data caches and combined caches, if they exist, are required to be consis- 
tent with other data caches, combined caches, storage, and I/O data 
transfers. However, to ensure consistency, aliased effective addresses 
(two effective addresses that map to the same real address) must have the 
same page offset (see Section 1.8, "Shared Storage," on page 333). 

If the effective address references storage outside of main storage (see 
Book III, Section 4.6, "Direct-Store Segments," on page 421), the instruc- 
tion is treated as a no-op. 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

This instruction is a hint that performance will probably be improved 
if the block containing the byte addressed by EA is fetched into the data 
cache, because the program will probably soon load from the addressed 
byte. Executing dcbt will not cause the system error handler to be 
invoked. 
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It is acceptable for the processor to treat this instruction as a load from 
the addressed byte with respect to address translation, storage protection, 
and reference and change recording, except that the system error handler 
must not be invoked for a translation or protection violation. 

Special Registers Altered 

None 



Data Cache Block Touch for Store X-form 

dcbtst RA,RB 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

This instruction is a hint that performance will probably be improved 
if the block containing the byte addressed by EA is fetched into the data 
cache, because the program will probably soon store into the addressed 
byte. Executing dcbtst will not cause the system error handler to be 
invoked. 

It is acceptable for the processor to treat this instruction as a load from 
the addressed byte with respect to address translation, storage protection, 
and reference and change recording, except that the system error handler 
must not be invoked for a translation or protection violation. Since 
dcbtst does not modify storage, it must not be recorded as a store. 

Special Registers Altered 

None 



Programming Note 

The purpose of the dcbtst 
instruction is to allow the 
program to request a 
cache block fetch before 
it is actually needed by 
the program. The 
program can later 
perform stores to put 
data into storage. 
However, the processor is 
not obliged to load the 
addressed block into the 
data cache. 



Data Cache Block set to Zero X-form 

dcbz RA,RB 
[Power mnemonic: dclz] 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

If the block containing the byte addressed by EA is in the data cache, 
all bytes of the block are set to zero. 

If the block containing the byte addressed by EA is not in the data 
cache and the corresponding page is Caching Allowed, the block is 
established in the data cache without fetching the block from main 
storage, and all bytes of the block are set to zero. 

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Programming Note 

If the page containing 
the byte addressed by EA 
is Caching Inhibited or 
Write Through Required, 
the system alignment 
error handler should set 
to zero all bytes of the 
area of main storage that 
corresponds to the 
addressed block. 

See Book III, Section 
5.5.2, "Machine Check 
Interrupt," on page 459 
for discussion of a 
possible delayed Machine 
Check interrupt that can 
be caused by dcbz if the 
operating system has set 
up an incorrect storage 
mapping. 

If the page containing the byte addressed by EA is Caching Inhibited 
or Write Through Required, then either (a) all bytes of the area of main 
storage that corresponds to the addressed block are set to zero, or (b) the 
system alignment error handler is invoked. 

If the block containing the byte addressed by EA is in Coherence 
Required mode, and the block exists in the data cache(s) of any other 
processor(s), it is kept coherent in those caches. 

This instruction is treated as a store to the addressed byte with respect 
to address translation, storage protection, and reference and change 
recording. 

Special Registers Altered 

None 

Data Cache Block Store X-form 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

If the block containing the byte addressed by EA is in Coherence 
Required mode, and a block containing the byte addressed by EA is in the 
data cache of any processor and has been modified, the writing of it to 
main storage is initiated. 

If the block containing the byte addressed by EA is in Coherence Not 
Required mode, and a block containing the byte addressed by EA is in the 
data cache of this processor and has been modified, the writing of it to 
main storage is initiated. 

The function of this instruction is independent of the Write Through 
Required/Not Required and Caching Inhibited/Allowed modes of the 
block containing the byte addressed by EA. 

It is acceptable for the processor to treat this instruction as a load from 
the addressed byte with respect to address translation, storage protection, 
and reference and change recording. 

Special Registers Altered 

None 
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Data Cache Block Flush X-form 
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Let the effective address (EA) be the sum (RA|0)+(RB). 

The action taken depends on the storage mode associated with the 
target and on the state of the block. The list below describes the action 
taken for the various cases. 

1. Coherence Required 

Unmodified Block 

Invalidate copies of the block in the caches of all processors. 

Modified Block 

Copy the block to storage. Invalidate copies of the block in the 
caches of all processors. 

Absent Block 

If modified copies of the block are in the caches of other processors, 
cause them to be copied to storage and invalidated. If unmodified 
copies are in the caches of other processors, cause those copies to 
be invalidated. 

2. Coherence Not Required 

Unmodified Block 

Invalidate the block in the processor's cache. 

Modified Block 

Copy the block to storage. Invalidate the block in the processor's 
cache. 

Absent Block 
Do nothing. 

The function of this instruction is independent of the Write Through 
Required/Not Required and Caching Inhibited/Allowed modes of the 
block containing the byte addressed by EA. 

It is acceptable for the processor to treat this instruction as a load from 
the addressed byte with respect to address translation, storage protection, 
and reference and change recording. 

Special Registers Altered 

None 
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Programming Note 

The eieio instruction is 
intended for use in 
performing 
memory-mapped I/O (see Book III, 
PowerPC Operating 
Environment 
Architecture) and in 
preventing load/store 
combining operations in 
main storage. It can be 
thought of as placing a 
barrier into the stream of 
storage accesses issued by 
a processor, such that any 
given storage access 
appears to be on the 
same side of the barrier 
to both the processor and 
the I/O device. 

The eieio instruction may 
complete before 
previously initiated 
storage accesses have 
been performed with 
respect to other 
processors and 
mechanisms. 



3.3 Enforce in-order Execution of I/O 
Instruction 

Enforce In-order Execution of I/O X-form 

eieio 



   31      ///     ///     ///     854     / 
   0       6       11      16      21      31 



The eieio instruction provides an ordering function for the effects of 
loads and stores executed by a processor. Executing an eieio instruction 
ensures that all applicable loads and stores previously initiated by the 
processor are complete with respect to main storage before any applica- 
ble loads and stores subsequently initiated by the processor access main 
storage. 

eieio orders loads and stores to storage that is both Caching Inhibited 
and Guarded, and stores to storage that is Write Through Required. It 
orders all these loads and stores as a single set (i.e., there is not one order 
for loads and stores to Caching Inhibited and Guarded storage, and 
another order for stores to Write Through Required storage). eieio does 
not affect the order of other data accesses, or of cache operations 
(whether caused explicitly by execution of a Cache Management instruc- 
tion or implicitly by the cache coherence mechanism). 

Special Registers Altered 

None 
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The Time Base (TB) is a 64-bit register (see Figure 48) containing a 64-bit 
unsigned integer that is incremented periodically. Each increment adds 1 
to the low-order bit (bit 63). The frequency at which the integer is 
updated is implementation-dependent. 



       TBU              TBL 
   0                32               63 

   Field   Description 
   TBU     Upper 32 bits of Time Base 
   TBL     Lower 32 bits of Time Base 

Figure 48. Time Base 

The Time Base increments until its value becomes 
0xFFFF_FFFF_FFFF_FFFF ($2^{64} - 1$). At the next increment, its value 
becomes 0x0000_0000_0000_0000. There is no explicit indication (such 
as an interrupt: see Book III, Chapter 5, "Interrupts," on page 453) that 
this has occurred. 

The period of the Time Base depends on the driving frequency. As an 
order of magnitude example, suppose that the CPU clock is 100 MHz 
and that the Time Base is driven by this frequency divided by 32. Then 
the period of the Time Base would be 

$$T = \frac{2^{64} \times 32}{100\ \text{MHz}} = 5.90 \times 10^{12}\ \text{seconds}$$



which is approximately 187,000 years. 
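
The arithmetic can be checked with a few lines of Python. This is only a sketch of the example above: the names CPU_HZ, DIVIDER, and time_base_period_seconds are illustrative, and the conversion to years assumes a 365.25-day year.

```python
# Figures taken from the example: 100 MHz CPU clock, Time Base driven at clock / 32.
CPU_HZ = 100_000_000
DIVIDER = 32
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length for the conversion


def time_base_period_seconds(cpu_hz=CPU_HZ, divider=DIVIDER):
    """Seconds for a 64-bit counter to wrap when it ticks at cpu_hz / divider."""
    ticks_per_second = cpu_hz / divider
    return 2**64 / ticks_per_second


period = time_base_period_seconds()
years = period / SECONDS_PER_YEAR
print(f"{period:.3e} seconds, about {years:,.0f} years")
```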






Chapter 4 Time Base 



Programming Note 

If the operating system 
initializes the Time Base 
on power-on to some 
reasonable value and the 
update frequency of the 
Time Base is constant, the 
Time Base can be used as 
a source of values that 
increase at a constant 
rate, such as for time 
stamps in trace entries. 

Even if the update 
frequency is not constant, 
values read from the 
Time Base are 
monotonically increasing 
(except when the Time 
Base wraps from $2^{64}-1$ to 
0). If a trace entry is 
recorded each time the 
update frequency 
changes, the sequence of 
Time Base values can be 
post-processed to 
become actual time 
values. 



The PowerPC Architecture does not specify a relationship between the 
frequency at which the Time Base is updated and other frequencies, such 
as the CPU clock or bus clock, in a PowerPC system. The Time Base 
update frequency is not required to be constant. What is required, so 
that system software can keep time of day and operate interval timers, is 
one of the following. 

■ The system provides an (implementation-dependent) interrupt to soft- 
ware whenever the update frequency of the Time Base changes, and a 
means to determine what the current update frequency is. 

■ The update frequency of the Time Base is under the control of the sys- 
tem software. 



4.1 Time Base Instructions 

Extended mnemonics 

A pair of extended mnemonics is provided for the mftb instruction so 
that it can be coded with the TBR name as part of the mnemonic rather 
than as a numeric operand. See Book III, Appendix B, "Assembler 
Extended Mnemonics," on page 495. 



Move From Time Base XFX-form 

mftb RT,TBR 





   31      RT      tbr             371     / 
   0       6       11              21      31 

   n <- tbr[5:9] || tbr[0:4] 
   if n = 268 then 
       if (64-bit implementation) then RT <- TB 
       else RT <- TB[32:63] 
   else if n = 269 then 
       if (64-bit implementation) then RT <- 0x0000_0000 || TB[0:31] 
       else RT <- TB[0:31] 

The TBR field denotes either the Time Base or Time Base Upper, 
encoded as shown in Figure 49 on page 353. The contents of the desig- 
nated register are placed into register RT. When reading Time Base 
Upper on a 64-bit implementation, the high-order 32 bits of register RT 
are set to zero. 
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
   decimal   TBR*                    Register name   Privileged 
             tbr[5:9]   tbr[0:4] 
   268       01000      01100        TB              no 
   269       01000      01101        TBU             no 

   * Note that the order of the two 5-bit halves of the TBR number is reversed. 

Figure 49. TBR encodings for mftb 

If the TBR field contains any value other than one of the values shown 
above then one of the following occurs. 

■ The system illegal instruction error handler is invoked. 

■ The system privileged instruction error handler is invoked. 

■ The results are boundedly undefined. 

Special Registers Altered 

None 

Extended Mnemonics: 

Extended mnemonics for Move From Time Base: 

   Extended:       Equivalent to: 
   mftb  Rt        mftb Rt,268 
   mftbu Rt        mftb Rt,269 



4.2 Reading the Time Base on 64-bit 
implementations 

The contents of the Time Base may be read into a GPR by the mftb 
extended mnemonic. To read the contents of the Time Base into register 
Rx, execute: 

mftb Rx 

Reading the Time Base has no effect on the value it contains or on the 
periodic incrementing of that value. 



Programming Note 

mftb serves as both a 
basic and an extended 
mnemonic. The 
assembler will recognize 
an mftb mnemonic with 
two operands as the basic 
form, and an mftb 
mnemonic with one 
operand as the extended 
form. Another way of 
saying this is that if mftb 
is coded with one 
operand, then that 
operand is assumed to be 
RT, and TBR defaults to 
the value corresponding 
to TB. 

Compiler and Assem- 
bler Note 

The TBR number coded in 
assembler language does 
not appear directly as a 
10-bit binary number in 
the instruction. The 
number coded is split 
into two 5-bit halves that 
are reversed in the 
instruction, with the 
high-order 5 bits 
appearing in bits 16:20 of 
the instruction and the 
low-order 5 bits in bits 
11:15. 
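
The field swap can be made concrete with a short encoder. The Python sketch below is illustrative only: encode_mftb and decode_tbr are hypothetical helper names, and the bit positions follow the instruction layout given above (bit 0 is the most significant bit of the 32-bit instruction word).

```python
def encode_mftb(rt, tbr=268):
    """Build the 32-bit instruction word for mftb RT,TBR (TBR = 268 or 269)."""
    if not 0 <= rt < 32:
        raise ValueError("RT must be a GPR number in 0..31")
    if tbr not in (268, 269):
        raise ValueError("TBR must be 268 (TB) or 269 (TBU)")
    high5 = (tbr >> 5) & 0x1F   # high-order 5 bits of the TBR number
    low5 = tbr & 0x1F           # low-order 5 bits of the TBR number
    return (
        (31 << 26)        # primary opcode, instruction bits 0:5
        | (rt << 21)      # RT, bits 6:10
        | (low5 << 16)    # low-order half goes in bits 11:15
        | (high5 << 11)   # high-order half goes in bits 16:20
        | (371 << 1)      # extended opcode, bits 21:30 (bit 31 is 0)
    )


def decode_tbr(word):
    """Recover the TBR number from an instruction word by undoing the swap."""
    low5 = (word >> 16) & 0x1F
    high5 = (word >> 11) & 0x1F
    return (high5 << 5) | low5


print(hex(encode_mftb(3)), hex(encode_mftb(3, 269)))
```

Decoding a word produced by encode_mftb returns the original TBR number, which corresponds to the concatenation n = tbr[5:9] || tbr[0:4] in the instruction description.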
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4.3 Reading the Time Base on 32-bit 
implementations 

On 32-bit implementations, it is not possible to read the entire 64-bit 
Time Base in a single instruction. The mftb extended mnemonic moves 
from the lower half of the Time Base (TBL) to a GPR, and the mftbu 
extended mnemonic moves from the upper half (TBU) to a GPR. 

Because of the possibility of a carry from TBL to TBU occurring 
between reads of TBL and TBU, a sequence such as the following is 
necessary to read the Time Base on 32-bit implementations. 

loop: 
    mftbu   Rx          # load from TBU 
    mftb    Ry          # load from TBL 
    mftbu   Rz          # load from TBU 
    cmpw    Rz,Rx       # see if 'old' = 'new' 
    bne     loop        # loop if carry occurred 

The comparison and loop are necessary to ensure that a consistent pair 
of values has been obtained. 

Programming Note 

This sequence also works correctly on a 64-bit implementation running 
in either 64- or 32-bit mode. 
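
The effect of the retry loop can be seen in a small simulation. The Python sketch below is a model, not an implementation: the class ToyTimeBase is hypothetical and, as a simplifying assumption, advances the counter by one tick on every read so that a carry from TBL to TBU can be forced between two reads.

```python
MASK64 = 2**64 - 1


class ToyTimeBase:
    """Toy 64-bit counter that advances by one tick each time it is read."""

    def __init__(self, value):
        self.value = value & MASK64

    def _tick(self):
        self.value = (self.value + 1) & MASK64

    def mftbu(self):
        """Read the upper 32 bits (TBU)."""
        self._tick()
        return self.value >> 32

    def mftb(self):
        """Read the lower 32 bits (TBL), as on a 32-bit implementation."""
        self._tick()
        return self.value & 0xFFFFFFFF


def read_time_base(tb):
    """Mirror of the mftbu / mftb / mftbu / cmpw / bne loop."""
    while True:
        upper = tb.mftbu()          # Rx
        lower = tb.mftb()           # Ry
        if tb.mftbu() == upper:     # Rz compared with Rx
            return (upper << 32) | lower


# Start just below a carry out of TBL: the first attempt sees TBU change and retries.
result = read_time_base(ToyTimeBase(0x0000_0001_FFFF_FFFE))
print(hex(result))
```

Without the comparison, the first pass in this run would pair the old TBU value (1) with the new TBL value (0), a result that is wrong by almost 2^32 ticks.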

4.4 Computing Time of Day from the 
Time Base 

Since the update frequency of the Time Base is implementation-depen- 
dent, the algorithm for converting the current value in the Time Base to 
time of day is also implementation-dependent. 

As an example, assume that the Time Base is incremented at a constant 
rate of once for every 32 cycles of a 100 MHz CPU instruction clock. 
What is wanted is the pair of 32-bit values comprising a POSIX standard 
clock:¹ the number of whole seconds that have passed since midnight 
January 1, 1970, and the remaining fraction of a second expressed as a 
number of nanoseconds. 

Assume that: 

■ The value 0 in the Time Base represents the start time of the POSIX 
clock (if this is not true, a simple 64-bit subtraction will make it so). 



¹ Described in POSIX Draft Standard P1003.4/D12, Draft Standard for Information 
Technology — Portable Operating System Interface (POSIX) — Part 1: System Applica- 
tion Program Interface (API) — Amendment 1: Realtime Extension [C Language]. Insti- 
tute of Electrical and Electronics Engineers, Inc., February 1992. 






■ The integer constant ticks_per_sec contains the value 

$$\frac{100\ \text{MHz}}{32} = 3{,}125{,}000$$

which is the number of times the Time Base is updated each second. 

■ The integer constant ns_adj contains the value 

$$\frac{1{,}000{,}000{,}000}{3{,}125{,}000} = 320$$

which is the number of nanoseconds per tick of the Time Base. 

64-bit Implementations 

The POSIX clock can be computed with an instruction sequence such as 
this: 



    mftb    Ry                  # Ry = Time Base 
    lwz     Rx,ticks_per_sec 
    divd    Rz,Ry,Rx            # Rz = whole seconds 
    stw     Rz,posix_sec 
    mulld   Rz,Rz,Rx            # Rz = quotient * divisor 
    sub     Rz,Ry,Rz            # Rz = excess ticks 
    lwz     Rx,ns_adj 
    mulld   Rz,Rz,Rx            # Rz = excess nanoseconds 
    stw     Rz,posix_ns 
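
The same computation can be written out in Python to check the arithmetic. This is a sketch under the assumptions of the example (a constant 3,125,000 ticks per second and a Time Base of 0 at the start of the POSIX clock); the function name posix_from_time_base is illustrative.

```python
TICKS_PER_SEC = 100_000_000 // 32             # 3,125,000 ticks per second
NS_ADJ = 1_000_000_000 // TICKS_PER_SEC       # 320 nanoseconds per tick
MASK32 = 0xFFFFFFFF


def posix_from_time_base(tb):
    """Return (posix_sec, posix_ns) for a 64-bit Time Base value."""
    whole_seconds = tb // TICKS_PER_SEC                    # divd
    excess_ticks = tb - whole_seconds * TICKS_PER_SEC      # mulld, sub
    excess_ns = excess_ticks * NS_ADJ                      # mulld
    # stw stores only the low-order 32 bits of each result
    return whole_seconds & MASK32, excess_ns & MASK32


print(posix_from_time_base(3_125_001))
```

Because ns_adj is an exact integer in this example, no rounding error accumulates; with a tick rate that does not divide one billion evenly, an integer ns_adj would only approximate the true value.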









32-bit Implementations 

On a 32-bit machine, direct implementation of the algorithm given above 
for 64-bit machines is awkward, due mainly to the difficulty of doing 
64-bit division.² Such division can be avoided entirely if a time of day clock 
in POSIX format is updated at least once each second. 

Assume that: 

■ The operating system maintains the following variables: 

— posix_tb (64 bits) 

— posix_sec (32 bits) 

— posix_ns (32 bits) 



² See D. E. Knuth, The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms, Section 4.3.1, Algorithm D. Addison-Wesley, 1981. 
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These variables hold the value of the Time Base and the computed 
POSIX second and nanosecond values from the last time the POSIX 
clock was computed. 

■ The operating system arranges for an interrupt (see Book III, Chapter 
5, "Interrupts," on page 453) to occur at least once per second, at 
which time it recomputes the POSIX clock values. 

■ The integer constant billion contains the value 1,000,000,000. 

The POSIX clock can be computed with an instruction sequence such 
as this: 



loop: 
    mftbu   Rx                  # Rx = TBU 
    mftb    Ry                  # Ry = TBL 
    mftbu   Rz                  # Rz = 'new' TBU value 
    cmpw    Rz,Rx               # see if 'old' = 'new' 
    bne     loop                # loop if carry occurred 
    # now have 64-bit TB in Rx and Ry 
    lwz     Rz,posix_tb+4 
    sub     Rz,Ry,Rz            # Rz = delta in ticks 
    lwz     Rw,ns_adj 
    mullw   Rz,Rz,Rw            # Rz = delta in ns 
    lwz     Rw,posix_ns 
    add     Rz,Rz,Rw            # Rz = new ns value 
    lwz     Rw,billion 
    cmpw    Rz,Rw               # see if past 1 second 
    blt     nochange            # branch if not 
    sub     Rz,Rz,Rw            # adjust nanoseconds 
    lwz     Rw,posix_sec 
    addi    Rw,Rw,1             # adjust seconds 
    stw     Rw,posix_sec        # store new seconds 
nochange: 
    stw     Rz,posix_ns         # store new ns 
    stw     Rx,posix_tb         # store new time base 
    stw     Ry,posix_tb+4 





Note that the upper half of the Time Base does not participate in the 
calculation to determine the new POSIX time of day. This is correct as 
long as the time change does not exceed one second. 
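
The incremental update can be modeled in Python to show why only the low-order words are needed. This is a sketch under the assumptions of the example (ns_adj = 320, updates no more than one second apart); update_posix_clock is a hypothetical name, and the three saved variables are passed and returned as plain integers.

```python
MASK32 = 0xFFFFFFFF
NS_ADJ = 320                # nanoseconds per tick, from the example
BILLION = 1_000_000_000


def update_posix_clock(posix_tb, posix_sec, posix_ns, tb_now):
    """Return the new (posix_tb, posix_sec, posix_ns) given the current Time Base."""
    # 32-bit subtraction of the low words; a borrow out of TBL wraps harmlessly
    delta_ticks = ((tb_now & MASK32) - (posix_tb & MASK32)) & MASK32
    delta_ns = (delta_ticks * NS_ADJ) & MASK32      # mullw keeps the low 32 bits
    new_ns = posix_ns + delta_ns
    if new_ns >= BILLION:                           # past one second
        new_ns -= BILLION
        posix_sec = (posix_sec + 1) & MASK32
    return tb_now, posix_sec, new_ns


# TBL wraps between the two samples, yet the 32-bit delta is still 0x20 ticks.
state = update_posix_clock(0x0000_0000_FFFF_FFF0, 10, 999_999_000, 0x0000_0001_0000_0010)
print(state)
```

If more than one second elapses between updates, the single conditional subtraction of one billion is no longer sufficient, which is the restriction stated in the note above.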

Non-constant update frequency 

In a system in which the update frequency of the Time Base may change 
over time, it is not possible to convert an isolated Time Base value into 
time of day. Instead, a Time Base value has meaning only with respect to 
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the current update frequency and the time of day that the update fre- 
quency was last changed. Each time the update frequency changes, either 
the system software is notified of the change via an interrupt (see Book 
III, Chapter 5, "Interrupts," on page 453), or the change was instigated 
by the system software itself. At each such change, the system software 
must compute the current time of day using the old update frequency, 
compute a new value of ticks_per_sec for the new frequency, and save the 
time of day, Time Base value, and tick rate. Subsequent calls to compute 
time of day use the current Time Base value and the saved data. 
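
One possible bookkeeping scheme is sketched below in Python. It is an assumption-laden illustration, not an architected mechanism: the ClockBase record and both function names are hypothetical, and time of day is kept as a single nanosecond count for brevity.

```python
from dataclasses import dataclass

NS_PER_SEC = 1_000_000_000
MASK64 = 2**64 - 1


@dataclass
class ClockBase:
    tod_ns: int          # time of day, in ns, when the frequency last changed
    tb: int              # Time Base value at that moment
    ticks_per_sec: int   # update frequency in effect since then


def time_of_day_ns(base, tb_now):
    """Time of day from the current Time Base value and the saved data."""
    elapsed_ticks = (tb_now - base.tb) & MASK64
    return base.tod_ns + elapsed_ticks * NS_PER_SEC // base.ticks_per_sec


def on_frequency_change(base, tb_now, new_ticks_per_sec):
    """Compute time of day at the old rate, then save it with the new rate."""
    return ClockBase(time_of_day_ns(base, tb_now), tb_now, new_ticks_per_sec)


base = ClockBase(tod_ns=0, tb=0, ticks_per_sec=3_125_000)
base = on_frequency_change(base, 3_125_000, 6_250_000)   # rate doubles after 1 second
now_ns = time_of_day_ns(base, 3_125_000 + 6_250_000)     # one more second at the new rate
print(now_ns)
```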
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Cross-Reference 
Changed POWER 
Mnemonics 




The following table lists the POWER instruction mnemonics that have 
been changed in the PowerPC Virtual Environment Architecture, sorted 
by POWER mnemonic. 

To determine the PowerPC mnemonic for one of these POWER mne- 
monics, find the POWER mnemonic in the second column of the table: 
the remainder of the line gives the PowerPC mnemonic and the page on 
which the instruction is described, as well as the instruction names. 

POWER mnemonics that have not changed are not listed. 



   Page   POWER                                      PowerPC 
          Mnemonic   Instruction                     Mnemonic   Instruction 
   347    dclz       Data Cache Line Set to Zero     dcbz       Data Cache Block set to Zero 
   346    ics        Instruction Cache Synchronize   isync      Instruction Synchronize 



New Instructions 




The following instructions in the PowerPC Virtual Environment Architec- 
ture are new: they are not in the POWER Architecture. They exist in all 
PowerPC implementations. 

dcbf Data Cache Block Flush 

dcbst Data Cache Block Store 

dcbt Data Cache Block Touch 

dcbtst Data Cache Block Touch for Store 

eieio Enforce In-order Execution of I/O 

icbi Instruction Cache Block Invalidate 

mftb Move From Time Base 



PowerPC Virtual i 
Environment Inst 




   Form   Opcode                Mode Dep.¹   Page   Mnemonic   Instruction 
          Primary   Extended 
   X      31        86                       349    dcbf       Data Cache Block Flush 
   X      31        54                       348    dcbst      Data Cache Block Store 
   X      31        278                      346    dcbt       Data Cache Block Touch 
   X      31        246                      347    dcbtst     Data Cache Block Touch for Store 
   X      31        1014                     347    dcbz       Data Cache Block set to Zero 
   X      31        854                      350    eieio      Enforce In-order Execution of I/O 
   X      31        982                      345    icbi       Instruction Cache Block Invalidate 
   XL     19        150                      346    isync      Instruction Synchronize 
   XFX    31        371                      352    mftb       Move From Time Base 

¹ All instructions in the PowerPC Virtual Environment Architecture are 
mode-independent, except that if the instruction refers to storage when in 
32-bit mode, only the low-order 32 bits of the 64-bit effective address are 
used to address storage. 
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ment Architecture. It covers instruc- 
tions and facilities not available to the 
application programmer, affecting stor- 
age control, interrupts, and timing 
facilities. 
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Introduction 



1.1 Overview 

Chapter 1 of Book I, PowerPC User Instruction Set Architecture, 
describes computation modes, compatibility with the POWER Architec- 
ture, document conventions, a general systems overview, instruction for- 
mats, and storage addressing. This chapter augments that description as 
necessary for the PowerPC Operating Environment Architecture. 

1.2 Compatibility with the POWER 
Architecture 

The PowerPC Architecture provides binary compatibility for POWER 
application programs, except as described in Book I, Appendix G, 
"Incompatibilities with the POWER Architecture," on page 271. Binary 
compatibility is not necessarily provided for privileged POWER instruc- 
tions. 

1.3 Document Conventions 

The notation and terminology used in Book I apply to this Book also, 
with the following substitutions: 

■ For "system alignment error handler" substitute "Alignment interrupt." 

■ For "system data storage error handler" substitute "Data Storage 
interrupt." 
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■ For "system error handler" substitute "interrupt." 

■ For "system floating-point assist error handler" substitute "Floating- 
Point Assist interrupt." 

■ For "system floating-point enabled exception error handler" substitute 
"Floating-Point Enabled Exception type Program interrupt." 

■ For "system floating-point unavailable error handler" substitute 
"Floating-Point Unavailable interrupt." 

■ For "system illegal instruction error handler" substitute "Illegal 
Instruction type Program interrupt." 

■ For "system instruction storage error handler" substitute "Instruction 
Storage interrupt." 

■ For "system privileged instruction error handler" substitute "Privi- 
leged Instruction type Program interrupt." 

■ For "system service program" substitute "System Call interrupt." 

■ For "system trap handler" substitute "Trap type Program interrupt." 

1.3.1 Definitions and Notation 

The definitions given in Book I are augmented by the following: 

■ The context of a program is the environment (e.g., privilege and relo- 
cation) in which the program executes. That context is controlled by 
the content of certain system registers, such as the MSR and SDR1, 
and of the address translation tables. 

■ An exception is an error, unusual condition, or external signal, that 
may set a status bit and may or may not cause an interrupt, depending 
upon whether or not the corresponding interrupt is enabled. 

■ An interrupt is the act of changing the machine state in response to an 
exception, as described in Chapter 5, "Interrupts," on page 453. 

■ A trap interrupt is an interrupt that results from execution of a Trap 
instruction. 

■ Hardware means any combination of hard-wired implementation, 
emulation assist, or interrupt for software assistance. In the last case, 
the interrupt may be to an architected location or to an implementa- 
tion-dependent location. Any use of emulation assists or interrupts to 
implement the architecture is described in Book IV, PowerPC Imple- 
mentation Features. 
■ /, //, ///, . . . denotes a field that is reserved in an instruction, in a register,
or in an architected storage table. 



1.3.2 Reserved Fields 



Some fields of certain storage tables may be written to automatically by 
hardware, e.g., Reference and Change bits in the Page Table. When the 
hardware writes to such a table, the following rules must be observed: 

■ No defined field other than the one(s) the hardware is specifically 
updating may be modified. 

■ Contents of reserved fields may be preserved by hardware or may be
written as 0s. No other changes to reserved fields may be made.
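
The two rules above can be expressed as a small check. The following is only an illustrative sketch, not part of the architecture: the function name, the mask parameters, and the example bit positions are hypothetical, and it assumes the "preserved or written as 0s" choice applies to all reserved bits of the entry together.

```python
def hardware_update_is_legal(old: int, new: int,
                             updated_mask: int, reserved_mask: int) -> bool:
    """Check a hardware write to a storage-table entry against the two rules.

    old, new       -- entry value before and after the write
    updated_mask   -- defined bits the hardware is specifically updating
    reserved_mask  -- bits that are reserved in the entry
    (all names and masks are hypothetical, chosen for this sketch)
    """
    changed = old ^ new
    # Rule 1: no defined field other than the one(s) being updated may change.
    if changed & ~(updated_mask | reserved_mask):
        return False
    # Rule 2: reserved bits are either preserved or written as 0s.
    reserved_new = new & reserved_mask
    return reserved_new == (old & reserved_mask) or reserved_new == 0
```

For example, with a made-up entry in which bit 0x100 is a Reference bit and the low four bits are reserved, setting the Reference bit while either keeping or clearing the reserved bits is accepted, and any other change is rejected.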

The handling of reserved bits in status and control registers described 
in Book I, Section 1.5.2, "Reserved Fields," on page 8 applies here as 
well. The reader should be aware that reading and writing of some of 
these registers (e.g., the MSR) can occur as a side effect of processing an 
interrupt and of returning from an interrupt, as well as when requested 
explicitly by the appropriate instruction (e.g., mtmsr). 



Programming Note 

System software should initialize reserved fields in architected storage
tables (Segment Table, Page Table) to 0s and not keep data in them, as the
fields may be assigned a meaning in some future version of the
architecture.



1.3.3 Description of Instruction Operation

The following augments the definitions given in Book I, Section 1.5.3, 
"Description of Instruction Operation," on page 8. 

Notation Meaning 
SEGREG(x) Segment Register x 



1.4 General Systems Overview 

The processor or processor unit contains the sequencing and process- 
ing controls for instruction fetch, instruction execution, and interrupt 
action. Instructions that the processing unit can execute fall into three 
classes: 

■ instructions executed in the Branch Processor 

■ instructions executed in the Fixed-Point Processor 

■ instructions executed in the Floating-Point Processor 
[Block diagram showing the Branch Processor, Instruction Cache, Fixed-Point
Processor, Floating-Point Processor, Data Cache, Main Memory, and Direct
Memory Access.]

Figure 50. Logical view of the PowerPC processor architecture 

Almost all instructions executed in the Branch Processor, Fixed-Point 
Processor, and Floating-Point Processor are nonprivileged and are 
described in Book I, PowerPC User Instruction Set Architecture. Book II, 
PowerPC Virtual Environment Architecture contains some cache man- 
agement instructions. Instructions related to the privileged state of the 
processor, control of processor resources, control of the storage hierar- 
chy, and all other privileged instructions are described here or in Book IV, 
PowerPC Implementation Features.



1.5 Instruction Formats 

See Book I, Chapter 1, "Introduction," on page 3 for a description of the 
instruction formats and addressing. 



1.5.1 Instruction Fields 

The following augments the instruction fields described in Book I, Section 
1.7.1, "Instruction Fields," on page 19. 
SPR (11:20)

Field used to specify a Special Purpose Register for the mtspr and mfspr 
instructions. The encoding is described in Section 3.4.1, "Move to/ 
from System Register Instructions," on page 384. 

SR (12:15) 

Field used to specify one of the 16 Segment Registers. 

1.6 Exceptions 

The following augments the list, given in Book I, Section 1.10, "Excep- 
tions," on page 26, of exceptions that can be caused directly by the exe- 
cution of an instruction. 

■ the execution of a Load or Store instruction to a direct-store segment 
that causes a Direct-Store Error exception (Data Storage interrupt) 

■ the execution of a traced instruction (Trace interrupt) 

1.7 Synchronization 

The synchronization described in this section refers to the state of the 
processor that is performing the synchronization. 

1.7.1 Context Synchronization 

An instruction or event is "context synchronizing" if it satisfies the 
requirements listed below. Such instructions and events are collectively 
called "context synchronizing operations." Examples of context synchro- 
nizing operations include the sc instruction, the rfi instruction, and most 
interrupts. 

1. The operation causes instruction dispatching (the issuance of instructions
by the instruction fetch mechanism to any instruction execution
mechanism) to be halted.

2. The operation is not initiated or, in the case of isync, is not completed,
until all instructions already in execution have completed to a point at
which they have reported all exceptions they will cause. (If a storage
access due to a previously initiated instruction may cause one or more
Direct-Store Error exceptions, the determination of whether it does
cause such exceptions is made before the operation is initiated.)
3. The instructions that precede the operation will complete execution in
the context (privilege, relocation, storage protection, etc.) in which 
they were initiated. 

4. If the operation directly causes an interrupt (e.g., sc directly causes a
System Call interrupt) or is an interrupt, the operation is not initiated 
until no exception exists having higher priority than the exception 
associated with the interrupt (see Section 5.8, "Interrupt Priorities," 
on page 475). 

5. The instructions that follow the operation will be fetched and exe- 
cuted in the context established by the operation. (This requirement 
dictates that any prefetched instructions be discarded, which in turn 
requires that any effects and side effects of speculatively executing 
them also be discarded. The only side effects of these instructions that 
are permitted to survive are those specified in Section 4.2.5, "Specula- 
tive Execution," on page 396.) 

A context synchronizing operation is necessarily execution synchronizing;
see Section 1.7.2, "Execution Synchronization." Unlike the sync
instruction [see Book II, page 334 (under Synchronize)], a context syn- 
chronizing operation need not wait for storage-related operations to 
complete on other processors, nor for Reference and Change bits in the 
Page Table (see Chapter 4, "Storage Control," on page 391) to be
updated. 

1.7.2 Execution Synchronization 

An instruction is "execution synchronizing" if all previously initiated 
instructions appear to have completed before the instruction is initiated 
or, in the case of sync and isync, before the instruction completes. Exam- 
ples of execution synchronizing instructions are sync (see Book I, 
page 80) and mtmsr. Also, all context synchronizing instructions (see Sec- 
tion 1.7.1) are execution synchronizing. 

Unlike a context synchronizing operation, an execution synchronizing 
instruction need not ensure that the instructions following that instruc- 
tion will execute in the context established by that instruction. This new 
context becomes effective sometime after the execution synchronizing 
instruction completes and before or at a subsequent context synchroniz- 
ing operation. 
2.1 Branch Processor Overview

This chapter describes the details concerning the registers and the privi- 
leged instructions implemented in the Branch Processor that are not cov- 
ered in Book I, PowerPC User Instruction Set Architecture. 



2.2 Branch Processor Registers 

2.2.1 Machine Status Save/Restore Register 0 

The Machine Status Save/Restore Register 0 (SRR0) is a 64-bit {32-bit}
register. This register is used to save machine status on interrupts, and to
restore machine status when a Return From Interrupt (rfi) instruction is
executed.

On interrupt, SRR0 is set to the current or next instruction address.
Thus if the interrupt occurs in 32-bit mode, the high-order 32 bits of
SRR0 are set to 0. When rfi is executed, the contents of SRR0 are copied
to the next instruction address (NIA), except that the high-order 32 bits
of the NIA are set to 0 when returning to 32-bit mode.
Programming Note

In some implementations, every instruction fetch when MSR[IR]=1, and every
instruction execution requiring address translation when MSR[DR]=1, may
have the side effect of modifying SRR0 and SRR1. For further details, see
the Book IV, PowerPC Implementation Features, document for the
implementation.



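[Register layout: the SRR0 address field occupies bits 0:61 {0:29}, followed
by a reserved field (//) that ends at bit 63 {31}.]

Figure 51. Save/Restore Register 0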







In general, SRR0 contains either the instruction address that caused
the interrupt or the instruction address to return to after an interrupt is 
serviced. 



2.2.2 Machine Status Save/Restore Register 1 

The Machine Status Save/Restore Register 1 (SRR1) is a 64-bit {32-bit}
register. This register is used to save machine status on interrupts and to 
restore machine status when an rfi instruction is executed. 



[Register layout: SRR1 occupies bits 0:63 {0:31}.]

Figure 52. Save/Restore Register 1



In general, when an interrupt occurs, bits 33:36 and 42:47 {1:4 and
10:15} of SRR1 are loaded with information specific to the interrupt
type, and bits 0:32, 37:41, and 48:63 {0, 5:9, and 16:31} of the MSR are
placed into the corresponding bit positions of SRR1.
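
The bit ranges above use a numbering in which bit 0 is the most significant bit. As a hedged sketch of how the two groups of SRR1 bits fit together on a 64-bit implementation, the helper names below (ibm_mask, build_srr1) are hypothetical and exist only for this illustration:

```python
def ibm_mask(first: int, last: int, width: int = 64) -> int:
    """Mask with bits first..last set, where bit 0 is the most significant bit."""
    nbits = last - first + 1
    return ((1 << nbits) - 1) << (width - 1 - last)


# MSR bits copied into SRR1 on a 64-bit implementation: 0:32, 37:41, 48:63.
MSR_SAVED_64 = ibm_mask(0, 32) | ibm_mask(37, 41) | ibm_mask(48, 63)
# SRR1 bits that receive interrupt-specific information: 33:36 and 42:47.
SRR1_INFO_64 = ibm_mask(33, 36) | ibm_mask(42, 47)


def build_srr1(msr: int, info: int) -> int:
    """Combine the saved MSR bits with interrupt-specific information bits."""
    return (msr & MSR_SAVED_64) | (info & SRR1_INFO_64)
```

The two masks are disjoint and together cover all 64 bits, and the low-order 32 bits of each mask reproduce the 32-bit ranges given in braces.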



2.2.3 Machine State Register 

The Machine State Register (MSR) is a 64-bit {32-bit} register. This regis- 
ter defines the state of the processor. On interrupt, the MSR bits are 
altered in accordance with Figure 80 on page 458. The MSR can also be 
modified by the mtmsr, sc, and rfi instructions. It can be read by the
mfmsr instruction. 



[Register layout: MSR occupies bits 0:63 {0:31}.]

Figure 53. Machine State Register
Below are shown the bit definitions for the Machine State Register.
The notation "full function" on a reserved bit means that it is saved in
SRR1 when an interrupt occurs. The notation "partial function" means
that it is not saved.

Bit(s) Description

0 64-bit mode (SF)

    0 The processor runs in 32-bit mode.

    1 The processor runs in 64-bit mode.

1:32 {0} Reserved full function

33:36 {1:4} Reserved partial function

37:41 {5:9} Reserved full function

42:44 {10:12} Reserved partial function

45 {13} Power Management Enable (POW)

    0 Power management is disabled (normal operation mode).

    1 Power management is enabled (reduced power mode).

    Power management functions are implementation-dependent. For
    further descriptions of the effect of this bit, see the Book IV,
    PowerPC Implementation Features document for the implementation.

46 {14} Implementation-Dependent Function

    See the Book IV, PowerPC Implementation Features document for
    the implementation.

47 {15} Interrupt Little-Endian Mode (ILE)

    When an interrupt is taken, this bit is copied into MSR[LE] to select
    the Endian mode for the context established by the interrupt.

48 {16} External Interrupt Enable (EE)

    0 The processor is disabled against External and Decrementer
    interrupts.

    1 The processor is enabled to take an External or Decrementer
    interrupt.
49 {17} Problem State (PR)

    0 The processor is privileged to execute any instruction.

    1 The processor can only execute the non-privileged instructions.

    MSR[PR] also affects storage protection, as described in Chapter 4,
    "Storage Control," on page 391.

50 {18} Floating-Point Available (FP)

    0 The processor cannot execute any floating-point instructions,
    including floating-point loads, stores, and moves.

    1 The processor can execute floating-point instructions.

51 {19} Machine Check Enable (ME)

    0 Machine Check interrupts are disabled.

    1 Machine Check interrupts are enabled.

52 {20} Floating-Point Exception Mode 0 (FE0)

    See the description on page 378.

53 {21} Single-Step Trace Enable (SE)

    0 The processor executes instructions normally.

    1 The processor generates a Single-Step type Trace interrupt after
    successfully completing the execution of the next instruction
    (unless that instruction is rfi, which is never traced).
    Successful completion means that the instruction caused no
    other interrupt. See Book IV, PowerPC Implementation Features.

    Single-step tracing may not be present on all implementations. If
    the function is not implemented, MSR[SE] is treated as a reserved
    bit.

54 {22} Branch Trace Enable (BE)

    0 The processor executes branch instructions normally.

    1 The processor generates a Branch type Trace interrupt after
    completing the execution of a branch instruction, whether or
    not the branch is taken. See Book IV, PowerPC Implementation
    Features.

    Branch tracing may not be present on all implementations. If the
    function is not implemented, MSR[BE] is treated as a reserved MSR
    bit.

55 {23} Floating-Point Exception Mode 1 (FE1)

    See the description on page 378.

56 {24} Reserved full function

    Programming Note

    POWER-compatible operating systems will probably write the value
    1 to bit 56 {24}.

57 {25} Interrupt Prefix (IP)

    In the following description, nnnnn is the offset of the interrupt.
    See Figure 81 on page 459.

    0 Interrupts are vectored to the real address 0x000n_nnnn
    in 32-bit implementations and real address
    0x0000_0000_000n_nnnn in 64-bit implementations.

    1 Interrupts are vectored to the real address 0xFFFn_nnnn
    in 32-bit implementations and real address
    0xFFFF_FFFF_FFFn_nnnn in 64-bit implementations.

58 {26} Instruction Relocate (IR)

    0 Instruction address translation is off.

    1 Instruction address translation is on.

59 {27} Data Relocate (DR)

    0 Data address translation is off.

    1 Data address translation is on.

60:61 {28:29} Reserved full function

62 {30} Recoverable Interrupt (RI)

    0 Interrupt is not recoverable.

    1 Interrupt is recoverable.

    Additional information about the use of this bit is given in
    Sections 5.4, "Interrupt Processing," on page 456, 5.5.1, "System
    Reset Interrupt," on page 457, and 5.5.2, "Machine Check
    Interrupt," on page 459.

63 {31} Little-Endian Mode (LE)

    0 The processor runs in Big-Endian mode.

    1 The processor runs in Little-Endian mode.
The Floating-Point Exception Mode bits FE0 and FE1 are interpreted
as shown below. For further details see Book I, page 153.

FE0   FE1   Mode

0     0     Interrupts disabled
0     1     Imprecise Nonrecoverable
1     0     Imprecise Recoverable
1     1     Precise
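
As an illustration of how the bit list and the FE0/FE1 table are read together, the following sketch decodes a 32-bit MSR image. It is not taken from the book: the dictionary and function names are hypothetical, only the 32-bit bit numbers in braces are used, and bit 0 is taken to be the most significant bit.

```python
# 32-bit MSR bit numbers from the list above (bit 0 is the most significant bit).
MSR_BITS_32 = {
    "POW": 13, "ILE": 15, "EE": 16, "PR": 17, "FP": 18, "ME": 19,
    "FE0": 20, "SE": 21, "BE": 22, "FE1": 23, "IP": 25, "IR": 26,
    "DR": 27, "RI": 30, "LE": 31,
}

# Interpretation of (FE0, FE1) from the table above.
FP_EXCEPTION_MODES = {
    (0, 0): "Interrupts disabled",
    (0, 1): "Imprecise Nonrecoverable",
    (1, 0): "Imprecise Recoverable",
    (1, 1): "Precise",
}


def msr_bit(msr: int, name: str) -> int:
    """Return the value (0 or 1) of the named bit in a 32-bit MSR image."""
    return (msr >> (31 - MSR_BITS_32[name])) & 1


def fp_exception_mode(msr: int) -> str:
    """Return the floating-point exception mode selected by FE0 and FE1."""
    return FP_EXCEPTION_MODES[(msr_bit(msr, "FE0"), msr_bit(msr, "FE1"))]
```

Because FE0 and FE1 are not adjacent (bits 20 and 23 in the 32-bit numbering), the sketch extracts them separately before looking up the mode.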



2.3 Branch Processor Instructions 



2.3.1 System Linkage Instructions 

These instructions provide the means by which a program can call upon 
the system to perform a service, and by which the system can return from 
performing a service or from processing an interrupt. 

These instructions are context synchronizing, as defined in Section 
1.7.1, "Context Synchronization," on page 371. 

The System Call instruction is described in Book I, page 41, but only 
at the level required by an application programmer. A complete descrip- 
tion of this instruction appears below. 



System Call SC-form

sc

[Power mnemonic: svca]

Instruction layout (field contents above, starting bit below):

    17    ///    ///    ///    1     /
    0     6      11     16     30    31

SRR0 <-iea CIA + 4
SRR1[33:36 42:47 {1:4 10:15}] <- 0
SRR1[0:32 37:41 48:63 {0 5:9 16:31}] <- MSR[0:32 37:41 48:63 {0 5:9 16:31}]
MSR <- new_value (see below)
NIA <-iea base_ea + 0xC00 (see below)

Compatibility Note

For a discussion of POWER compatibility with respect to instruction bits
16:29, please refer to Book I, Appendix G, "Incompatibilities with the
POWER Architecture," on page 271. For compatibility with future versions
of this architecture, these bits should be coded as zero.



The effective address of the instruction following the System Call
instruction is placed into SRR0. Bits 0:32, 37:41, and 48:63 {0, 5:9, and
16:31} of the MSR are placed into the corresponding bits of SRR1, and
bits 33:36 and 42:47 {1:4 and 10:15} of SRR1 are set to undefined values.
Then a System Call interrupt is generated. The interrupt causes the
MSR to be altered as described in Section 5.5, "Interrupt Definitions," on
page 457.

The interrupt causes the next instruction to be fetched from offset
0xC00 from the base real address indicated by the new setting of MSR[IP].
This instruction is context synchronizing.

Special Registers Altered 

SRR0 SRR1 MSR
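
The relationship between the Interrupt Prefix bit and the address from which the System Call handler is fetched can be sketched as follows. This is an illustrative helper only; the function name and parameters are hypothetical, and it assumes the vector offset fits in the five hexadecimal digits shown as n_nnnn in the Interrupt Prefix description.

```python
SYSTEM_CALL_OFFSET = 0xC00  # offset named in the System Call description


def interrupt_vector(ip: int, offset: int, bits: int = 32) -> int:
    """Real address of an interrupt vector, given the IP bit and the offset."""
    if ip not in (0, 1):
        raise ValueError("ip must be 0 or 1")
    if not 0 <= offset <= 0xFFFFF:
        raise ValueError("offset must fit in five hex digits")
    all_ones = (1 << bits) - 1
    # IP=1 selects the base whose high-order bits are all ones; IP=0 selects 0.
    base = (all_ones & ~0xFFFFF) if ip else 0
    return base | offset
```

With these assumptions, interrupt_vector(0, SYSTEM_CALL_OFFSET) gives 0x0000_0C00 and interrupt_vector(1, SYSTEM_CALL_OFFSET) gives 0xFFF0_0C00 on a 32-bit implementation.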



Return From Interrupt XL-form

rfi

Instruction layout (field contents above, starting bit below):

    19    ///    ///    ///    50    /
    0     6      11     16     21    31

MSR[0:32 37:41 48:63 {0 5:9 16:31}] <- SRR1[0:32 37:41 48:63 {0 5:9 16:31}]
NIA <-iea SRR0[0:61 {0:29}] || 0b00

Bits 0:32, 37:41, and 48:63 {0, 5:9, and 16:31} of SRR1 are placed
into the corresponding bits of the MSR. If the new MSR value does not
enable any pending exceptions, then the next instruction is fetched, under
control of the new MSR value, from the address SRR0[0:61 {0:29}] || 0b00
(32-bit implementations, and 64-bit implementations when SF=1 in the
new MSR value) or 32 zero bits || SRR0[32:61] || 0b00 (64-bit
implementations when SF=0 in the new MSR value). If the new MSR value
enables one or more pending exceptions, the interrupt associated with the
highest priority pending exception is generated; in this case the value
placed into SRR0 by the interrupt processing mechanism (see Section 5.4,
"Interrupt Processing," on page 456) is the address of the instruction
that would have been executed next had the interrupt not occurred.

This instruction is privileged and context synchronizing. 

Special Registers Altered 

MSR 
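
The register transfers performed by rfi can be modeled in a few lines. The sketch below is illustrative and makes several assumptions: it models a 64-bit implementation, it assumes that no pending exception is enabled by the new MSR value, it leaves MSR bits outside the listed ranges unchanged, and its function and constant names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def ibm_mask(first: int, last: int, width: int = 64) -> int:
    """Mask with bits first..last set, where bit 0 is the most significant bit."""
    nbits = last - first + 1
    return ((1 << nbits) - 1) << (width - 1 - last)


# MSR bits restored from SRR1: 0:32, 37:41, and 48:63.
MSR_RESTORED = ibm_mask(0, 32) | ibm_mask(37, 41) | ibm_mask(48, 63)
MSR_SF = ibm_mask(0, 0)  # 64-bit mode bit


def rfi(msr: int, srr0: int, srr1: int):
    """Return (new_msr, nia), assuming no pending exception becomes enabled."""
    new_msr = (msr & ~MSR_RESTORED & MASK64) | (srr1 & MSR_RESTORED)
    nia = srr0 & ~0b11 & MASK64      # SRR0[0:61] || 0b00
    if not new_msr & MSR_SF:         # returning to 32-bit mode
        nia &= 0xFFFF_FFFF           # 32 zero bits || SRR0[32:61] || 0b00
    return new_msr, nia
```

The low-order two bits of SRR0 are always replaced by zeros, and the high-order 32 bits of the next instruction address are cleared only when the restored MSR selects 32-bit mode.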
3.1 Fixed-Point Processor Overview

This chapter describes the details concerning the registers and the privi- 
leged instructions implemented in the Fixed-Point Processor that are not 
covered in Book I, PowerPC User Instruction Set Architecture. 

3.2 PowerPC Special Purpose Registers

The Special Purpose Registers are read and written via the mfspr (page
387) and mtspr (page 384) instructions. The descriptions of these
instructions list the valid encodings of SPR numbers. Encodings not listed are
reserved for future use or for use as implementation-specific registers.

Most SPRs are defined in other parts of this book; see the index to 
locate those definitions. Some SPRs are specific to an implementation. See 
Appendix E, "Implementation-Specific SPRs," on page 501 and Book IV, 
PowerPC Implementation Features.

3.3 Fixed-Point Processor Registers

3.3.1 Data Address Register

The Data Address Register (DAR) is a 64-bit {32-bit} register. See Section 
5.5.3, "Data Storage Interrupt," on page 460 and Section 5.5.6, "Align- 
ment Interrupt," on page 464. 
When an interrupt that uses the DAR occurs, the DAR is set to the
effective address associated with the interrupting instruction. If the inter- 
rupt occurs in 32-bit mode, the high-order 32 bits of the DAR are set 
to 0. 



[Register layout: DAR occupies bits 0:63 {0:31}.]

Figure 54. Data Address Register



3.3.2 Data Storage Interrupt Status Register 

The Data Storage Interrupt Status Register (DSISR) is a 32-bit register 
that defines the cause of Data Storage and Alignment interrupts. See
Section 5.5.3, "Data Storage Interrupt," on page 460 and Section 5.5.6,
"Alignment Interrupt," on page 464.

[Register layout: DSISR occupies bits 0:31.]

Figure 55. Data Storage Interrupt Status Register

3.3.3 Software-Use SPRs 

SPRG0 through SPRG3 are 64-bit {32-bit} registers provided for operating
system use.

[Register layout: SPRG0, SPRG1, SPRG2, and SPRG3 each occupy bits 0:63 {0:31}.]

Figure 56. Software-use SPRs

The following list describes the conventional uses of SPRG0 through
SPRG3.
SPRG0 Software may load a unique real address in this register to identify
an area of storage reserved for use by the first-level interrupt handler.
This area must be unique for each processor in the system.

SPRG1 This register may be used as a scratch register by the first-level
interrupt handler to save the contents of a GPR. That GPR then can
be loaded from SPRG0 and used as a base register to save other
GPRs to storage.

SPRG2 This register may be used by the operating system as needed.

SPRG3 This register may be used by the operating system as needed.



3.3.4 Processor Version Register 

The Processor Version Register (PVR) is a 32-bit read-only register that contains
a value identifying the specific version (model) and revision level of the 
PowerPC processor. The contents of the PVR can be copied to a GPR by 
the mfspr instruction. Read access to the PVR is privileged; write access is 
not provided. 



[Register layout: Version occupies bits 0:15 and Revision occupies bits 16:31.]

Figure 57. Processor Version Register

The PVR contains two fields: 

Version A 16-bit number that uniquely determines a particular processor 
version and a version of the PowerPC Architecture. This number 
can be used to determine the version of a processor; it may not 
distinguish between different product models if more than one 
model uses the same processor. 

Revision A 16-bit number that distinguishes between various releases of 
a particular version, i.e., an Engineering Change level. 

The value of the Version portion of the PVR is assigned by the 
PowerPC Architecture process. The value of the Revision portion of the 
PVR is implementation-defined. 
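
Once privileged software has copied the PVR into a GPR, the two fields can be separated with a shift and a mask. A minimal sketch follows; the function name is hypothetical, and the value used in the example is made up rather than the PVR of any real processor.

```python
def split_pvr(pvr: int):
    """Split a 32-bit PVR image into its (version, revision) fields."""
    version = (pvr >> 16) & 0xFFFF   # bits 0:15, the high-order half
    revision = pvr & 0xFFFF          # bits 16:31, the low-order half
    return version, revision


# Made-up value, used only to show the field boundaries.
version, revision = split_pvr(0x12345678)
```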
3.4 Fixed-Point Processor Privileged Instructions

3.4.1 Move to/from System Register Instructions

The Move To Special Purpose Register and Move From Special Purpose 
Register instructions are described in Book I, Section 3.3.14, "Move to/ 
from System Register Instructions," on page 128, but only at the level 
available to an application programmer. In particular, no mention is 
made there of registers that can be accessed only in privileged state. A 
complete description of these instructions appears below. 

Extended mnemonics 

A set of extended mnemonics is provided for the mtspr and mfspr instruc- 
tions so that they can be coded with the SPR name as part of the mne- 
monic rather than as a numeric operand. See Appendix B, "Assembler 
Extended Mnemonics," on page 495. 



Move To Special Purpose Register XFX-form

mtspr SPR,RS

Instruction layout (field contents above, starting bit below):

    31    RS    spr    467    /
    0     6     11     21     31

n <- spr[5:9] || spr[0:4]
if length(SPREG(n)) = 64 then
    SPREG(n) <- (RS)
else
    SPREG(n) <- (RS)[32:63 {0:31}]

The SPR field denotes a Special Purpose Register, encoded as shown in 
Figure 58 on page 385. The contents of register RS are placed into the 
designated Special Purpose Register. For Special Purpose Registers that 
are 32 bits long, the low-order 32 bits of RS are placed into the SPR. 

For this instruction, SPRs TBL and TBU are treated as separate 32-bit 
registers; setting one leaves the other unaltered. 

spr[0]=1 if and only if writing the register is privileged. Execution of this
instruction specifying a defined and privileged register when MSR[PR]=1
will result in a Privileged Instruction type Program interrupt.

If MSR[PR]=1, the only effect of executing this instruction with an SPR
number that is not shown in Figure 58 and has spr[0]=1 is to cause either
an Illegal Instruction type Program interrupt or a Privileged Instruction
type Program interrupt. For all other cases (MSR[PR]=0 or spr[0]=0), if the
SPR field contains any value that is not shown in Figure 58 then either an
Illegal Instruction type Program interrupt occurs or the results are
boundedly undefined.
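
The cases described above can be summarized as a decision function. The sketch below is illustrative only: the function names and result strings are hypothetical, the set of defined SPR numbers is transcribed from Figure 58, and it ignores the fact that the ASR (280) exists only on 64-bit implementations. Because spr[0:4] holds the low-order five bits of the SPR number, spr[0] corresponds to the bit with value 16 in the decimal SPR number.

```python
# SPR numbers listed in Figure 58 (mtspr); 528..543 are the BAT registers.
MTSPR_DEFINED = {1, 8, 9, 18, 19, 22, 25, 26, 27,
                 272, 273, 274, 275, 280, 282, 284, 285} | set(range(528, 544))


def spr0(n: int) -> int:
    """spr[0]: the high-order bit of the low-order 5-bit half of SPR number n."""
    return (n >> 4) & 1


def mtspr_outcome(n: int, msr_pr: int) -> str:
    """Classify the architected outcome of mtspr for SPR number n."""
    if n in MTSPR_DEFINED:
        if spr0(n) and msr_pr:
            return "Privileged Instruction type Program interrupt"
        return "SPR written"
    if spr0(n) and msr_pr:
        return "Illegal Instruction or Privileged Instruction type Program interrupt"
    return "Illegal Instruction type Program interrupt or boundedly undefined"
```

Under this reading, the only defined SPR numbers with spr[0]=0 are 1 (XER), 8 (LR), and 9 (CTR), which matches the "Privileged" column of Figure 58.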

Special Registers Altered 

See Figure 58 



decimal   SPR¹               Register name   Privileged
          spr5:9   spr0:4

1         00000    00001     XER             no
8         00000    01000     LR              no
9         00000    01001     CTR             no
18        00000    10010     DSISR           yes
19        00000    10011     DAR             yes
22        00000    10110     DEC             yes
25        00000    11001     SDR1            yes
26        00000    11010     SRR0            yes
27        00000    11011     SRR1            yes
272       01000    10000     SPRG0           yes
273       01000    10001     SPRG1           yes
274       01000    10010     SPRG2           yes
275       01000    10011     SPRG3           yes
280       01000    11000     ASR²            yes
282       01000    11010     EAR             yes
284       01000    11100     TBL             yes
285       01000    11101     TBU             yes
528       10000    10000     IBAT0U          yes
529       10000    10001     IBAT0L          yes
530       10000    10010     IBAT1U          yes
531       10000    10011     IBAT1L          yes
532       10000    10100     IBAT2U          yes
533       10000    10101     IBAT2L          yes
534       10000    10110     IBAT3U          yes
535       10000    10111     IBAT3L          yes
536       10000    11000     DBAT0U          yes
537       10000    11001     DBAT0L          yes
538       10000    11010     DBAT1U          yes
539       10000    11011     DBAT1L          yes
540       10000    11100     DBAT2U          yes
541       10000    11101     DBAT2L          yes
542       10000    11110     DBAT3U          yes
543       10000    11111     DBAT3L          yes

¹Note that the order of the two 5-bit halves of the SPR number is reversed.
²64-bit implementations only.

Figure 58. SPR encodings for mtspr

Programming Note

For a discussion of software synchronization requirements when altering
certain Special Purpose Registers, please refer to Chapter 7,
"Synchronization Requirements for Special Registers and for Lookaside
Buffers," on page 483.

Compatibility Note

For a discussion of POWER compatibility with respect to SPR numbers not
shown in the instruction descriptions for mtspr and mfspr, please refer to
Book I, Appendix G, "Incompatibilities with the POWER Architecture," on
page 271. For compatibility with future versions of this architecture, only
SPR numbers discussed in these instruction descriptions should be used.

Compiler and Assembler Note

For the mtspr and mfspr instructions, the SPR number coded in assembler
language does not appear directly as a 10-bit binary number in the
instruction. The number coded is split into two 5-bit halves that are
reversed in the instruction, with the high-order 5 bits appearing in bits
16:20 of the instruction and the low-order 5 bits in bits 11:15. This
maintains compatibility with POWER SPR encodings, in which these two
instructions have only a 5-bit SPR field occupying bits 11:15.
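
The swapping of the two 5-bit halves described in the Compiler and Assembler Note can be sketched as below. The function names are hypothetical; the primary opcode (31) and extended opcodes (467 for mtspr, 339 for mfspr) are those shown in the instruction layouts in this section, and bit 0 is taken to be the most significant bit of the 32-bit instruction word.

```python
def spr_field(n: int) -> int:
    """10-bit spr field (instruction bits 11:20) for SPR number n."""
    if not 0 <= n < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    low, high = n & 0x1F, n >> 5
    # The low-order half goes to bits 11:15 and the high-order half to bits 16:20.
    return (low << 5) | high


def encode_mtspr(n: int, rs: int) -> int:
    """Instruction word for mtspr n,RS."""
    return (31 << 26) | (rs << 21) | (spr_field(n) << 11) | (467 << 1)


def encode_mfspr(rt: int, n: int) -> int:
    """Instruction word for mfspr RT,n."""
    return (31 << 26) | (rt << 21) | (spr_field(n) << 11) | (339 << 1)
```

For SPR 8 (the Link Register) and GPR 0, these functions produce 0x7C0803A6 for mtspr and 0x7C0802A6 for mfspr.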
Move From Special Purpose Register XFX-form

mfspr RT,SPR

Instruction layout (field contents above, starting bit below):

    31    RT    spr    339    /
    0     6     11     21     31

n <- spr[5:9] || spr[0:4]
if length(SPREG(n)) = 64 then
    RT <- SPREG(n)
else
    RT <- 32 zero bits || SPREG(n)

Compiler/Assembler/Compatibility Notes

See the Notes that appear with mtspr.

The SPR field denotes a Special Purpose Register, encoded as shown in 
Figure 59. The contents of the designated Special Purpose Register are 
placed into register RT. For Special Purpose Registers that are 32 bits 
long, the low-order 32 bits of RT receive the contents of the Special Pur- 
pose Register and the high-order 32 bits of RT are set to zero. 

spr[0]=1 if and only if reading the register is privileged. Execution of this
instruction specifying a defined and privileged register when MSR[PR]=1
will result in a Privileged Instruction type Program interrupt.

If MSR[PR]=1, the only effect of executing this instruction with an SPR
number that is not shown in Figure 59 and has spr[0]=1 is to cause either
an Illegal Instruction type Program interrupt or a Privileged Instruction
type Program interrupt. For all other cases (MSR[PR]=0 or spr[0]=0), if the
SPR field contains any value that is not shown in Figure 59 then either an
Illegal Instruction type Program interrupt occurs or the results are
boundedly undefined.



Special Registers Altered 

None 



decimal   SPR¹               Register name   Privileged
          spr5:9   spr0:4

1         00000    00001     XER             no
8         00000    01000     LR              no
9         00000    01001     CTR             no
18        00000    10010     DSISR           yes
19        00000    10011     DAR             yes
22        00000    10110     DEC             yes
25        00000    11001     SDR1            yes
26        00000    11010     SRR0            yes
27        00000    11011     SRR1            yes
272       01000    10000     SPRG0           yes
273       01000    10001     SPRG1           yes
274       01000    10010     SPRG2           yes
275       01000    10011     SPRG3           yes
280       01000    11000     ASR²            yes
282       01000    11010     EAR             yes
287       01000    11111     PVR             yes
528       10000    10000     IBAT0U          yes
529       10000    10001     IBAT0L          yes
530       10000    10010     IBAT1U          yes
531       10000    10011     IBAT1L          yes
532       10000    10100     IBAT2U          yes
533       10000    10101     IBAT2L          yes
534       10000    10110     IBAT3U          yes
535       10000    10111     IBAT3L          yes
536       10000    11000     DBAT0U          yes
537       10000    11001     DBAT0L          yes
538       10000    11010     DBAT1U          yes
539       10000    11011     DBAT1L          yes
540       10000    11100     DBAT2U          yes
541       10000    11101     DBAT2L          yes
542       10000    11110     DBAT3U          yes
543       10000    11111     DBAT3L          yes

¹Note that the order of the two 5-bit halves of the SPR number is reversed.
²64-bit implementations only.

Moving from the Time Base (TB and TBU) is accomplished with the mftb instruction,
described in Book II, Section 4.1, "Time Base Instructions," on page 352.

Figure 59. SPR encodings for mfspr
Move To Machine State Register X-form

mtmsr RS

Instruction layout (field contents above, starting bit below):

    31    RS    ///    ///    146    /
    0     6     11     16     21     31

MSR <- (RS)

The contents of register RS are placed into the MSR. 

This instruction is privileged and execution synchronizing. 

In addition, alterations to the EE and RI bits are effective as soon as
the instruction completes. Thus if MSR[EE]=0 and an External or
Decrementer interrupt is pending, executing an mtmsr instruction that sets
MSR[EE] to 1 will cause the External or Decrementer interrupt to be taken
before the next instruction is executed if no higher priority exception
exists (see Section 5.8, "Interrupt Priorities," on page 475).



Programming Note 

For a discussion of software synchronization requirements when altering
certain MSR bits, please refer to Chapter 7, "Synchronization Requirements
for Special Registers and for Lookaside Buffers," on page 483.



Special Registers Altered 

MSR 



Move From Machine State Register X-form

mfmsr RT

Instruction layout (field contents above, starting bit below):

    31    RT    ///    ///    83    /
    0     6     11     16     21    31

RT <- MSR

The contents of the MSR are placed into register RT.

This instruction is privileged.

Special Registers Altered 

None 
4.1 Storage Addressing

A program references storage using the effective address computed by the 
processor when it executes a load, store, branch, or cache instruction, 
and when it fetches the next sequential instruction. The effective address 
is translated to a real address according to procedures described in Sec- 
tion 4.3, "Address Translation Overview," on page 399 and following. 
The real address is what is sent to the memory subsystem. See Figure 60 
on page 400. 

For a complete discussion of storage addressing and effective address 
calculation, see Book I, Section 1.11, "Storage Addressing," on page 27. 

Storage Control Overview 

■ Page size is 2^12 bytes (4 KB) 

■ Segment size is 2^28 bytes (256 MB) 

■ 64-bit implementations: 

— Maximum real memory size is 2^64 bytes (16 EB) 

— Effective Address Range is 2^64 

— Virtual Address Range is 2^80 

— Number of segments is 2^52 

■ 32-bit implementations: 

— Maximum real memory size is 2^32 bytes (4 GB) 

— Effective Address Range is 2^32 

— Virtual Address Range is 2^52 

— Number of segments is 2^24 

■ There are two types of storage segments, based on the state of the T 
bit in the Segment Table Entry or Segment Register selected by the 
effective address: 

— T=0: Ordinary segment 

— T=1: Direct-store segment 
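
The segment counts in this list follow from the field widths: a 12-bit byte offset within a page, a 28-bit offset within a segment, and the width of the effective or virtual address. The Python sketch below recomputes them. The function and key names are illustrative assumptions, not part of the architecture.

```python
PAGE_BITS = 12      # 4 KB pages
SEGMENT_BITS = 28   # 256 MB segments

def storage_geometry(ea_bits, va_bits):
    """Derive segment-related sizes from the address widths (illustrative helper)."""
    return {
        "pages_per_segment": 1 << (SEGMENT_BITS - PAGE_BITS),
        "esid_bits": ea_bits - SEGMENT_BITS,   # Effective Segment ID width
        "vsid_bits": va_bits - SEGMENT_BITS,   # Virtual Segment ID width
        "segments": 1 << (va_bits - SEGMENT_BITS),
    }

geom64 = storage_geometry(ea_bits=64, va_bits=80)   # 64-bit implementations
geom32 = storage_geometry(ea_bits=32, va_bits=52)   # 32-bit implementations
```

The number of segments is governed by the virtual address width, not the effective address width, which is why it exceeds the number of distinct Effective Segment IDs.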



4.2 Storage Model 

The storage model provides the following features: 

1. The architecture allows the storage implementations to take advantage 
of the performance benefits of weak ordering of storage access 
between processors or between processors and devices. 

2. The architecture provides instructions that allow the programmer to 
ensure a consistent and ordered storage state. 

■ dcbf      ■ lwarx 

■ dcbst     ■ eieio 

■ dcbz      ■ stdcx. 

■ icbi      ■ stwcx. 

■ isync     ■ sync 

■ ldarx     ■ tlbsync 

3. Processor ordering: storage accesses by a single processor appear to 
complete sequentially from the view of the programming model but 
may complete out of order with respect to the ultimate destination in 
the storage hierarchy. Order is guaranteed at each level of the storage 
hierarchy for accesses to the same address from the same processor. 

4. Storage consistency between processors and between a processor and 
I/O is controlled by software through mode bits in the Page Table or 
BAT register. See Section 4.8.2, "Supported Storage Modes," on 
page 431. Six modes are supported using the control bits: 

■ Write Through 

■ Caching Inhibited 

■ Memory Coherence 



4.2.1 Storage Segments 

Storage is divided into 256 MB (2^28 byte) segments. These segments can 
be of two types: 

■ ordinary segment 

Address translation is controlled by the setting of the relocate bits 
MSR[DR] for data and MSR[IR] for instructions. MSR[IR] and MSR[DR] are 
independent bits and may be set differently. The state of these bits may 
be changed by interrupts or by executing the appropriate instructions. 
An effective address in these segments represents a real or virtual 
address depending on the setting of the relocate bits of the MSR. 

■ direct-store segment 

Such segments may be used for access to I/O. Instruction fetch from 
direct-store segments is not allowed. MSR[DR] must be 1 when accessing 
data in a direct-store segment. See Section 4.6, "Direct-Store 
Segments," on page 421 for an explanation of direct-store segments. 

The value of the T bit in the Segment Table Entry or Segment Register 
distinguishes between ordinary segments and direct-store segments. 



Programming Note 

It is possible to provide 
larger segments to 
application programs by 
using multiple adjacent 
segments. 



T    Segment type 

0    Ordinary segment 
1    Direct-store segment 



The T bit in the Segment Table Entry or Segment Register is ignored 
when fetching instructions with MSR[IR]=0 or when accessing data with 
MSR[DR]=0. Such accesses are not considered references to direct-store 
segments. 

See also Section 4.6, "Direct-Store Segments," on page 421. 



4.2.2 Storage Exceptions 

If the appropriate relocate bit in the MSR is set to 1, each effective 
address is translated to a real address before the storage access is 
performed. A storage exception occurs if the effective address is not 
translated by the Block Address Translation mechanism (see Section 4.7, 
"Block Address Translation," on page 423), and one of the following 
applies: 

64-bit implementations: 

■ There is no valid entry in the Segment Table for the segment 
specified by the effective address. 

■ The appropriate Segment Table entry is found, but there is no 
valid entry in the Page Table for the page specified by the effec- 
tive address. 

■ The appropriate Segment Table and Page Table entries are 
found, but the access is not allowed by the storage protection 
mechanism. 

32-bit implementations: 

■ There is no valid entry in the Page Table for the page specified 
by the effective address. 

■ The appropriate Page Table entry is found but the access is not 
allowed by the storage protection mechanism. 
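
The conditions above amount to a short decision procedure. The following Python sketch restates them under simplifying assumptions: the relevant relocate bit is 1, the segment is an ordinary segment, and the outcome of each table lookup is passed in as a boolean. The function name and parameters are hypothetical.

```python
def storage_exception(bat_hit, ste_valid, pte_valid, access_allowed, is_64bit=True):
    """Return the reason for a storage exception, or None if none of the
    listed conditions applies (illustrative; direct-store segments not modeled)."""
    if bat_hit:
        # Translated by Block Address Translation: the conditions below
        # do not apply.
        return None
    if is_64bit and not ste_valid:
        return "no valid Segment Table entry"
    if not pte_valid:
        return "no valid Page Table entry"
    if not access_allowed:
        return "access not allowed by storage protection"
    return None
```

On a 32-bit implementation the Segment Table check is skipped, so `ste_valid` is ignored when `is_64bit` is false.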

Storage exceptions cause Instruction Storage interrupts and Data Stor- 
age interrupts that identify the address of the failing instruction. 

In certain cases a storage exception may result in the "restart" of (re- 
execution of at least part of) a load or store instruction. See Book II, Sec- 
tion 2.1, "Instruction Restart," on page 341. 

4.2.3 Instruction Fetch 

Instructions are fetched under control of MSR[IR]. When any context 
synchronizing event occurs, any prefetched instructions are discarded and 
then refetched using the then-current state of MSR[IR]. 

MSR[IR]=0 

When instruction relocation is off, MSR[IR]=0, the effective address is 
interpreted as described in Section 4.2.7, "Real Addressing Mode," on 
page 399. 

MSR[IR]=1 

Instructions are fetched using the address translated by one of the 
following mechanisms: 

1. Segmented Address Translation Mechanism 
2. Block Address Translation Mechanism 

Instruction fetch from direct-store segments is not supported. An 
attempt to execute an instruction in a direct-store segment will result in 
an Instruction Storage interrupt. 

Implicit Branch 

Explicitly altering certain MSR bits (using mtmsr), or explicitly altering 
Segment Table Entries, Page Table Entries, or certain system registers, 
may have the side effect of changing the addresses, effective or real, from 
which the current instruction stream is being fetched. This side effect is 
called an implicit branch. For example, an mtmsr instruction that 
changes the value of MSR[SF] may change the effective addresses from 
which the current instruction stream is being fetched. The MSR bits and 
system registers for which alteration can cause an implicit branch are 
indicated as such in Chapter 7, "Synchronization Requirements for 
Special Registers and for Lookaside Buffers," on page 483. Implicit 
branches are not supported by the PowerPC Architecture. If an implicit 
branch occurs, the results are boundedly undefined. 

4.2.4 Data Storage Access 

Data accesses are controlled by MSR[DR]. When the state of MSR[DR] 
changes, subsequent accesses are made using the new state of MSR[DR]. 

MSR[DR]=0 

When data relocation is off, MSR[DR]=0, the effective address is 
interpreted as described in Section 4.2.7, "Real Addressing Mode," on 
page 399. 

MSR[DR]=1 

When address relocation is on, MSR[DR]=1, the effective address is 
translated by one of the following mechanisms: 

1. Segmented Address Translation Mechanism 

2. Block Address Translation Mechanism 

3. Direct-Store Segment Translation Mechanism 

4.2.5 Speculative Execution 

Data Access 

A speculative operation is one that a program "might" perform and that 
the hardware decides to execute out of order on the speculation that the 
result will be needed. If subsequent events indicate that the speculative 
instruction would not have been executed, the processor abandons any 
result the instruction produced. Typically, hardware executes instructions 
speculatively when it has resources that would otherwise be idle, so that 
the operation is done without cost or almost so. 

Most operations can be performed speculatively, as long as the 
machine appears to follow a simple sequential model such as that 
presented in Book I, Section 2.2, "Instruction Fetching," on page 31. 
Certain speculative operations are not permitted: 

■ A speculative store may not be performed in such a manner that the 
alteration of the target location can be observed by other processors or 
mechanisms until it can be determined that the store is no longer spec- 
ulative. 

■ Speculative loads from Guarded storage (see below) are prohibited, 
except that if a load or store operation will be executed, the entire 
cache block(s) containing the referenced data may be loaded into the 
cache. 

■ No error of any kind other than Machine Check may be reported due 
to the speculative execution of an instruction, until such time as it is 
known that execution of the instruction is required. 

Speculative loads are allowed from any storage that is not Guarded. If 
a Machine Check exception results, a Machine Check interrupt may be 
generated even if the data access that caused the Machine Check excep- 
tion would not have been performed because a previous uncompleted 
operation would have changed the execution path. 

Only one side effect (other than Machine Check) of speculative 
execution is permitted when a speculative instruction's result is 
abandoned: the Reference bit(s) in the referenced Page Table Entry(s) may 
be set due to a speculative load. 

Instruction Prefetch 

The processor typically fetches instructions ahead of the one(s) currently 
being executed in order to avoid delay. Such instruction prefetching is a 
speculative operation in that prefetched instructions may not be executed 
due to intervening branches or interrupts. 

Most prefetching is permitted, as long as the machine appears to fol- 
low a simple sequential model such as that presented in Book I, Section 
2.2, "Instruction Fetching," on page 31. Certain prefetching is not per- 
mitted: 

■ Prefetching from Guarded storage (see below) is prohibited, except 
that if an instruction in a cache block will be executed, the entire cache 
block may be loaded into the cache. 

■ No error of any kind other than Machine Check may be reported due 
to instruction prefetching, until such time as the instruction that is the 
target of such prefetch becomes the instruction to be executed. 

Speculative instruction fetches are allowed from any storage that is not 
Guarded. If a Machine Check exception results, a Machine Check inter- 
rupt may be generated even if the instruction fetch that caused the 
Machine Check exception would not have been executed because a previ- 
ous uncompleted operation would have changed the execution path. 

Only one side effect (other than Machine Check) of instruction 
prefetching is permitted: the Reference bit(s) in the referenced Page Table 
Entry(s) may be set. 

Guarded Storage 

Storage is said to be "Guarded" if either (a) the G bit is 1 in the relevant 
PTE or DBAT register, or (b) MSR bit IR or DR is 0 for instruction 
fetches or data loads respectively. (In case (b) all of storage is Guarded.) 
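
The definition above reduces to a two-input predicate. The following Python sketch states it directly; the function and parameter names are assumptions chosen for illustration.

```python
def is_guarded(g_bit, relocation_on):
    """Return True if the storage being accessed is Guarded.

    g_bit:         G bit from the relevant PTE or DBAT register; it has no
                   effect when relocation is off.
    relocation_on: MSR bit IR for instruction fetches, MSR bit DR for data loads.
    """
    if not relocation_on:
        return True          # case (b): all of storage is Guarded
    return g_bit == 1        # case (a): the G bit decides
```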

Storage in a Guarded area may not be well behaved with regard to 
prefetching and other speculative storage operations. Such storage may 
represent an I/O device, and a speculative load or instruction fetch 
directed to such a device may cause the device to perform unexpected or 
incorrect operations. 

Storage addresses in a Guarded area may not have successors; that is, 
there may be "holes" in a Guarded area of the real address space. On any 
system, the highest real address has no successor. Lack of a successor 
address means that speculative sequential operations such as instruction 
prefetching may fail and may result in a Machine Check. 

As used below, "branch path" means the execution path as determined 
by Branch instructions. 

Load or Store Instruction 

A load or store instruction may not speculatively access Guarded storage 
unless one of the following conditions exists: 

1. The target storage location is in a cache. In this case, the location may 
be accessed in the cache or in main storage. 

2. The target storage is Caching Allowed (I=0) and it is guaranteed that 
the load or store is on the branch path that will be executed (in the 
absence of any intervening interrupts). In this case, the entire cache 
block containing the target storage location may be loaded into the 
cache. 

3. The target storage is Caching Inhibited (I=1), the load or store is on 
the branch path that will be executed, and no prior instructions can 
cause an interrupt. 



Programming Note 

A Trap, sc, or rfi instruction will not necessarily prevent access to 
Guarded storage. When an access to Guarded storage is to be made based 
on some condition, the access must be protected by a Branch instruction. 



Instruction Fetch 

Instructions may not be speculatively fetched from Guarded storage 
unless one of the following conditions exists: 

1. The target storage location is in a cache. In this case, the location may 
be accessed in the cache or in main storage. 

2. MSR[IR]=1 and an instruction has previously been fetched from the 
page. 

3. It is guaranteed that the instruction to be fetched is on the branch path 
that will be taken (in the absence of any intervening interrupts). If 
MSR[IR]=0, only the cache block containing the target instruction may 
be fetched. 



Programming Note 

If the last instruction of a program is a Trap, sc, or rfi (which may be 
followed by data, uninitialized storage, or instructions from a different 
program), the Trap/sc/rfi must be immediately followed by an 
unconditional branch to instructions that will not access Guarded 
storage (such as a branch to self). 

This will prevent speculative fetching or execution of the contents of the 
storage following the Trap/sc/rfi. Such speculative action could cause an 
access to Guarded storage, and the Trap/sc/rfi alone will not necessarily 
prevent such an access. 



4.2.6 32-Bit Mode on a 64-Bit 
Implementation 

The computation of the 64-bit effective address is independent of mode. 
When a 64-bit implementation executes in 32-bit mode (MSR[SF]=0), the 
high-order 32 bits of the 64-bit effective address are treated as zero for 
the purpose of addressing storage. This applies to both data accesses and 
instruction fetches. It applies when address translation is disabled, and to 
both translation modes (Segmented Address Translation and Block 
Address Translation) when address translation is enabled. This 
truncation of the EA is the only respect in which storage accesses are 
mode-dependent. 



Programming Note 

Treating the high-order 32 bits of the effective address as zero 
effectively truncates the 64-bit effective address to a 32-bit effective 
address such as would have been generated on a 32-bit implementation. 
Thus, for example, for Segmented Address Translation the ESID in 32-bit 
mode is the high-order four bits of this truncated effective address; the 
ESID thus lies in the range 0:15. These four bits would select a Segment 
Register on a 32-bit implementation; they select one of 16 STEGs in the 
Segment Table on a 64-bit implementation. These STEGs can be used to 
emulate the 32-bit implementation's Segment Registers. 



4.2.7 Real Addressing Mode 

Whether address translation is enabled is controlled by MSR[IR] for 
instruction fetching and by MSR[DR] for data loads and stores. If address 
translation is disabled for a particular access (fetch, load, or store), the 
effective address is treated as the real address and is passed directly to the 
memory subsystem. 

The EA is a 64-bit {32-bit} quantity computed by the CPU. The width 
of the real address supported by a particular implementation will be less 
than or equal to this quantity. If it is less, the high-order bits of the EA are 
ignored when the real address is formed. 

Accesses in real mode bypass all storage protection checks (see Section 
4.10) and do not cause the recording of reference and change information 
(see Section 4.9). Real mode data accesses are performed as though the 
storage access mode bits "WIMG" were 0011 (see Section 4.8). Real 
mode instruction fetches are performed as though the "WIMG" bits were 
either 0001 or 0011. 

Access to direct-store segments (see Section 4.6) is not possible when 
translation is disabled, as Segment Table Entries (see "Segment Table," 
on page 404) or Segment Registers (see "Segment Registers," on 
page 413) are not checked for a T=1 specification. 



Warning: An attempt to fetch from, load from, or store to a real 
address that is not physically present in the machine may result in a 
Machine Check interrupt or a Checkstop (see Section 5.5.2, "Machine 
Check Interrupt," on page 459). 
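
As an illustration of how a real-mode address is formed on a 64-bit implementation, the following Python sketch applies the two rules described in Sections 4.2.6 and 4.2.7: in 32-bit mode the high-order 32 bits of the EA are treated as zero, and any EA bits above the implemented real address width are ignored. The function name, the `sf` parameter (standing for the 64-bit mode bit of the MSR), and `real_addr_bits` are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1

def real_mode_address(ea, sf=1, real_addr_bits=64):
    """Form the real address for an access made with translation disabled.

    sf:             1 for 64-bit mode, 0 for 32-bit mode.
    real_addr_bits: hypothetical width of the implementation's real address.
    """
    ea &= MASK64
    if sf == 0:
        ea &= 0xFFFF_FFFF                      # high-order 32 bits treated as zero
    return ea & ((1 << real_addr_bits) - 1)    # unimplemented high-order bits ignored
```

The sketch does not model the Machine Check or Checkstop that may result when the resulting address is not physically present.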



4.3 Address Translation Overview 

Figure 60 on page 400 gives an overview of the address translation pro- 
cess on PowerPC. 

The effective address (EA) is the address generated by the processor 
for load and store instructions or for instruction fetch. This address is 
passed simultaneously to two translation mechanisms: 

■ Segmented Address Translation, described in Section 4.4 on page 401 
for 64-bit implementations and in Section 4.5 on page 412 for 32-bit 
implementations, and 

■ Block Address Translation, described in Section 4.7 on page 423. 

A typical effective address will be successfully translated by just one of 
these mechanisms. If neither mechanism is successful, a storage exception 
(see page 393) results. If both mechanisms are successful, Block Address 
Translation takes precedence. 

[Figure: the Effective Address is presented to two paths. Segmented 
Address Translation: lookup in Segment Table, then either Ordinary 
Segment (Virtual Address Translation, lookup in Page Table, Real 
Address) or Direct-Store Segment (I/O Address). Block Address 
Translation: match against BAT Registers, Real Address.] 

Figure 60. PowerPC address translation 

An effective address that translates successfully via the Segmented 
Address Translation mechanism (but not by the Block Address Transla- 
tion mechanism) is a reference to one of two types of segments: 

■ A direct-store segment, in which case the address is converted directly 
to an I/O address and is passed to the I/O subsystem for further action, 
or 

■ An ordinary segment, in which case the address is converted to a real 
address that is then used to access storage. 

An effective address that translates successfully via the Block Address 
Translation mechanism is converted directly to a real address that is then 
used to access storage. 
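
The precedence rules in this overview can be summarized as a small selection function. The Python sketch below is a simplification: the result of each mechanism is passed in as a boolean, and the returned strings are illustrative labels, not architectural terms.

```python
def select_translation(bat_hit, segment_hit, t_bit=0):
    """Choose the outcome for one effective address (illustrative only).

    bat_hit:     Block Address Translation succeeded.
    segment_hit: Segmented Address Translation succeeded.
    t_bit:       T bit of the selected segment (1 = direct-store).
    """
    if bat_hit:
        return "real address via BAT"            # BAT takes precedence
    if not segment_hit:
        return "storage exception"               # neither mechanism succeeded
    if t_bit == 1:
        return "I/O address via direct-store segment"
    return "real address via Page Table"
```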



Book III PowerPC Operating Environment Architecture 






4.4 Segmented Address Translation, 
64-Bit Implementations 

Figure 61 shows the steps involved in translating from an effective 
address to a real address on a 64-bit implementation. 



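[Figure: the 64-bit EA consists of a 36-bit Effective Segment ID, a 16-bit 
Page field, and a 12-bit Byte field. A lookup in the Segment Table 
replaces the Effective Segment ID with a 52-bit Virtual Segment ID, 
giving the 80-bit VA. A lookup in the Page Table replaces the Virtual 
Segment ID and Page fields with a 52-bit Real Page Number, giving the 
64-bit RA. The 12-bit Byte field passes through unchanged.] 

Figure 61. Address translation overview (64-bit implementations) 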

If an access is translated by the Block Address Translation mechanism 
(BAT, see Section 4.7 on page 423), the BAT takes precedence and the 
results of segmented address translation are not used. If an access is not 
translated by a BAT, segmented address translation proceeds as follows. 

The effective address (EA) is a 64-bit quantity computed by the pro- 
cessor. Bits 0:35 of the EA are the Effective Segment ID (ESID); these are 
looked up in the Segment Table to produce a Virtual Segment ID (VSID). 
Bits 36:51 of the EA are the Page Number within the segment; these are 
concatenated with the VSID from the Segment Table to form the Virtual 
Page Number (VPN). The VPN is looked up in the Page Table to produce 
a Real Page Number (RPN). Bits 52:63 of the EA are the byte offset 
within the page; these are concatenated with the RPN to form the real 
address (RA) that is used to access storage. 

If the processor is executing in 32-bit mode (MSR[SF]=0), the 
translation process described above is followed except that the high-order 32 
bits of the 64-bit effective address (that is, bits 0:31 of the ESID) are 
forced to zero before the lookup in the Segment Table starts. Bits 32:35 of 
the EA, which are the high-order 4 bits of the lower 32 bits of the EA, 
thus constitute the ESID. 

If the selected Segment Table Entry identifies the segment as a direct- 
store segment, the Page Table is not referred to. Rather, translation con- 
tinues as described in Section 4.6, "Direct-Store Segments," on page 421. 

For ordinary segments the translation moves in two steps from effec- 
tive address to virtual address (which never exists as a specific entity but 
can be considered to be the concatenation of the VPN and byte offset), 
and from virtual address to real address. 

The first step in segmented address translation is to convert the effec- 
tive address to a virtual address, as described in Section 4.4.1 on 
page 402. The second step, conversion of the virtual address to a real 
address, is described in Section 4.4.2 on page 406. 
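
The bit fields named above can be extracted with shifts and masks. The following Python sketch does so and then walks the two translation steps. As a deliberate simplification, the Segment Table and Page Table are modeled as plain dictionaries rather than the hashed tables described in Sections 4.4.1 and 4.4.2; all function and parameter names are hypothetical.

```python
def split_ea(ea, sf=1):
    """Split a 64-bit EA into (ESID, page number, byte offset)."""
    ea &= (1 << 64) - 1
    if sf == 0:
        ea &= 0xFFFF_FFFF          # 32-bit mode: EA bits 0:31 forced to zero
    esid = ea >> 28                # EA bits 0:35
    page = (ea >> 12) & 0xFFFF     # EA bits 36:51
    byte = ea & 0xFFF              # EA bits 52:63
    return esid, page, byte

def translate(ea, segment_table, page_table, sf=1):
    """EA -> RA for an ordinary segment, using dict stand-ins for the tables."""
    esid, page, byte = split_ea(ea, sf)
    vsid = segment_table[esid]     # step 1: ESID -> VSID
    vpn = (vsid << 16) | page      # Virtual Page Number = VSID || page
    rpn = page_table[vpn]          # step 2: VPN -> Real Page Number
    return (rpn << 12) | byte      # RA = RPN || byte offset

ra = translate(0x0000_0001_2345_6789,
               segment_table={0x12: 0xABCDE},
               page_table={0xABCDE3456: 0x42})
```

A missing dictionary key raises `KeyError` here; in the architecture the corresponding event is a storage exception.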



4.4.1 Virtual Address Generation, 64-Bit 
Implementations 

Conversion of a 64-bit effective address to a virtual address is done by 
searching a hashed segment table pointed to by the Address Space Regis- 
ter, as shown in Figure 62 on page 403. 



Programming Note 

The values 0, 0x1000, and 
0x2000 cannot be used as 
Segment Table 
addresses, since these 
pages contain interrupt 
vectors. 



Address Space Register 

The ASR is shown in Figure 63 on page 404. This 64-bit special-purpose 
register holds the real address of the Segment Table. The Segment Table 
defines the set of segments that can be addressed at any one time. 

Access to the ASR is privileged. The ASR may be read or written by 
the mfspr and mtspr instructions. See "Move From Special Purpose 
Register XFX-form," on page 387 and "Move To Special Purpose 
Register XFX-form," on page 384. 
Figure 62. Translation of 64-bit effective address to virtual address 

Figure 63. Address Space Register 



Segment Table 

The Segment Table (STAB) is a one-page data structure that defines the 
mapping between Effective Segment IDs and Virtual Segment IDs. The 
STAB must be on a page boundary. 

The STAB contains 32 Segment Table Entry Groups (STEGs). An 
STEG contains 8 Segment Table Entries (STEs) of 16 bytes each; each 
STEG is thus 128 bytes long. STEGs are entry points for searches of the 
Segment Table. 

See Section 4.12, "Table Update Synchronization Requirements," on 
page 446 for the rules that software must follow when updating the Seg- 
ment Table. 

Segment Table Entry 

Each Segment Table Entry (STE) maps one ESID to one VSID. Additional 
information in the STE controls the STAB search process and provides 
input to the storage protection mechanism. Figure 64 on page 405 shows 
the layout of an STE. 

See Section 4.10, "Storage Protection," on page 436 for a discussion 
of the storage key bits. 

Segment Table Search 

An outline of the STAB search process is shown in Figure 62 on 
page 403. The detailed algorithm is as follows: 

1. Primary Hash: Bits 0:51 of the ASR are concatenated with bits 
31:35 of the effective address (the low 5 bits of the ESID) and with a 
field of seven 0s to form the 64-bit real address of a Segment Table 
Entry Group. This operation, referred to as the "Primary STAB 
Hash," identifies a particular STEG, each of whose 8 STEs will be 
tested in turn. 

2. The first STE in the selected STEG is tested for a match with the EA. In 
order for a match to exist, the following must be true: 

■ STE[V] = 1 

■ STE[ESID] = EA[0:35] 

If a match is found, the STE search terminates successfully. 

3. Step 2 is repeated for each of the other 7 STEs in the STEG. The first 
matching STE terminates the search. If none of the 8 STEs match, the 
secondary hash must be tried. 

4. Secondary Hash: Bits 0:51 of the ASR are concatenated with the 
one's complement of bits 31:35 of the effective address and with a 
field of seven 0s to form the 64-bit real address of a Segment Table 
Entry Group. This operation is referred to as the "Secondary STAB 
Hash." 

5. The first STE in the selected STEG is tested for a match with the EA. In 
order for a match to exist, the following must be true: 

■ STE[V] = 1 

■ STE[ESID] = EA[0:35] 

If a match is found, the STE search terminates successfully. 

6. Step 5 is repeated for each of the other 7 STEs in the STEG. The first 
matching STE terminates the search. If none of the 8 STEs match, the 
search fails. 

Figure 64. Segment Table Entry format 
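
The six search steps can be sketched in Python as follows. Because the STE bit layout is not reproduced here, each STE is modeled as a hypothetical `(valid, esid, vsid)` tuple, and storage is modeled as a dictionary mapping the real address of an STEG to its list of STEs. These modeling choices and all names are assumptions made for the example.

```python
def steg_address(asr, ea, secondary=False):
    """Real address of the STEG selected by the primary or secondary hash."""
    stab_origin = asr & ~0xFFF         # ASR bits 0:51
    hash5 = (ea >> 28) & 0x1F          # EA bits 31:35 (low 5 bits of the ESID)
    if secondary:
        hash5 ^= 0x1F                  # one's complement of those 5 bits
    return stab_origin | (hash5 << 7)  # followed by a field of seven 0s

def stab_search(asr, ea, memory):
    """Return the VSID of the matching STE, or None if the search fails."""
    esid = ea >> 28                    # EA bits 0:35
    for secondary in (False, True):    # primary hash first, then secondary
        steg = memory.get(steg_address(asr, ea, secondary), [])
        for valid, ste_esid, vsid in steg:
            if valid == 1 and ste_esid == esid:
                return vsid            # first matching STE ends the search
    return None

# Hypothetical table at real address 0x3000, with one entry reachable only
# through the secondary hash.
memory = {0x3680: [(1, 0x12, 0xABC)]}
vsid = stab_search(0x3000, 0x0000_0001_2345_6789, memory)
```

In this example the ESID is 0x12, so the primary hash selects the STEG at 0x3900 and the secondary hash selects the STEG at 0x3680.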
Programming Notes 

1. Segment Table entries may or may not be cached in an SLB. 

2. Segment Table lookups are done using real addresses and storage 
access mode M=1 (Memory Coherence required). 

3. It is possible that the hardware implements two SLB arrays (one for 
data and one for instructions). In this case, the size, shape and values 
contained by the arrays may be different. 

4. The ASR must point to a valid Segment Table whenever address 
relocation is enabled (MSR[IR]=1 or MSR[DR]=1 or both) and the 
effective address is not covered by BAT translation. 

5. Use the slbie or slbia instruction to ensure that the SLB no longer 
contains a mapping for a particular segment. 

6. See Chapter 7, "Synchronization Requirements for Special Registers 
and for Lookaside Buffers," on page 483, for the synchronization 
requirements that must be satisfied when a program changes the 
contents of the ASR. 

7. Hardware never modifies the Segment Table. 



If the Segment Table search succeeds, the Virtual Page Number (VPN) 
is formed by concatenating the VSID from the matching STE with bits 
36:51 of the effective address (the page number). The complete 80-bit vir- 
tual address (VA) is formed by concatenating the VPN with bits 52:63 of 
the EA (the byte offset). 

If the search fails, a page fault interrupt is taken. This will be an 
Instruction Storage interrupt or a Data Storage interrupt, depending on 
whether the effective address is for an instruction fetch or for data access. 

If the selected STE has T=1, the reference is to a direct-store segment. 
No reference is made to the Page Table; processing continues as described 
in Section 4.6, "Direct-Store Segments," on page 421. 

Segment Lookaside Buffer 

Conceptually, the Segment Table is searched by the address relocation 
hardware to translate every reference. For performance reasons, the hard- 
ware usually keeps a Segment Lookaside Buffer (SLB) that holds STEs 
that have recently been used. The SLB is searched prior to searching the 
Segment Table. As a consequence, when software makes changes to the 
Segment Table, it must perform the appropriate SLB invalidate opera- 
tions to maintain the consistency of the SLB with the tables. 



4.4.2 Virtual to Real Translation, 64-Bit 
Implementations 

Conversion of an 80-bit virtual address to a real address is done by 
searching a hashed page table located by SDR1, as shown in Figure 65 on 
page 407. 

Generation of the 80-bit virtual address that is input to this stage of 
the translation process is described in Section 4.4.1, "Virtual Address 
Generation, 64-Bit Implementations," on page 402. 



Page Table 

The Hashed Page Table (HTAB) is a variable-sized data structure that 
defines the mapping between Virtual Page Numbers and Real Page Num- 
bers. The HTAB's size must be a power of 2, and its starting address must 
be a multiple of its size. 

The layout of the HTAB is similar to that of the Segment Table, except 
that the HTAB's size is variable while the STAB's size is exactly one page. 
The HTAB contains a number of Page Table Entry Groups (PTEGs). A 
PTEG contains eight Page Table Entries (PTEs) of 16 bytes each; each 
PTEG is thus 128 bytes long. PTEGs are entry points for searches of the 
Page Table. 

Figure 65. Translation of 80-bit virtual address to 64-bit real address 

See Section 4.12, "Table Update Synchronization Requirements," on 
page 446 for the rules that software must follow when updating the Page 
Table. 

Page Table Entry 

Each Page Table Entry (PTE) maps one VPN to one RPN. Additional 
information in the PTE controls the HTAB search process and provides 
input to the storage protection mechanism. Figure 66 shows the layout of 
a PTE. 
Figure 66. Page Table Entry, 64-bit implementations 

The PTE contains an Abbreviated Page Index rather than the complete 
Page field. At least 11 of the low-order bits of the VPN are used in the 
hash function to select a PTEG. These bits are not repeated in the PTEs of 
that PTEG. 

Page Table Size 

The number of entries in the Page Table directly affects performance 
because it influences the hit ratio in the Page Table and thus the rate of 
page fault interrupts. If the table is too small, it is possible that not all the 
virtual pages that actually have real pages assigned can be mapped via the 
Page Table. This can happen if too many hash collisions occur and there 
are more than 16 entries for the same primary/secondary pair of PTEGs. 
While this situation cannot be guaranteed not to occur for any size Page 
Table, making the Page Table larger than the minimum size will reduce 
the frequency of occurrence of such collisions. 

Storage Description Register 1 

The SDR1 register is shown in Figure 67. 



Programming Note 

It is recommended that 
the number of PTEGs in 
the Page Table be at least 
one-half the number of 
real pages to be accessed. 

As an example, if the amount of real memory to be accessed is 
2^31 bytes (2 GB), then we have 2^(31-12) = 2^19 real pages. The 
minimum recommended Page Table size would be 2^18 PTEGs, or 
2^25 bytes (32 MB). 
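
The sizing recommendation in this note is simple arithmetic, sketched below in Python. The rounding up to a power of 2 and the floor of 2^11 PTEGs reflect the constraints on Page Table size stated in this section; the function name is an assumption.

```python
PAGE_SIZE = 1 << 12    # 4 KB pages
PTEG_SIZE = 128        # 8 PTEs of 16 bytes each
MIN_PTEGS = 1 << 11    # smallest Page Table: 2^11 PTEGs (256 KB)

def recommended_page_table_bytes(real_memory_bytes):
    """Minimum recommended Page Table size: one PTEG per two real pages."""
    real_pages = real_memory_bytes // PAGE_SIZE
    ptegs = max(MIN_PTEGS, real_pages // 2)
    ptegs = 1 << (ptegs - 1).bit_length()   # round up to a power of 2
    return ptegs * PTEG_SIZE
```

For 2 GB of real memory this reproduces the 32 MB figure given in the note.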
+------------------------------+-------+----------+
|           HTABORG            |  ///  | HTABSIZE |
+------------------------------+-------+----------+
 0                           45 46   58 59      63

Bits     Name       Description 

0:45     HTABORG    Real address of Page Table 
59:63    HTABSIZE   Encoded size of Page Table 

All other fields are reserved. 

Figure 67. SDR1, 64-bit implementations 

The HTABORG field in SDR1 contains the high-order 46 bits of the 
64-bit real address of the Page Table. The Page Table is thus constrained 
to lie on a 2^18 byte (256 KB) boundary at a minimum. At least 11 bits 
from the hash function (see Figure 65 on page 407) are used to index into 
the Page Table. The minimum size Page Table is 256 KB (2^11 PTEGs of 
128 bytes each). 

The Page Table can be any size 2^n bytes, where 18 ≤ n ≤ 46. As the Page 
Table size is increased, more bits are used from the hash to index into the 
table and the value in HTABORG must have more of its low-order bits 
equal to 0. 

The HTABSIZE field in SDR1 contains an integer giving the number 
of bits from the hash that are used in the Page Table index. HTABSIZE is 
used to generate a mask of the form 0b00...011...1, which is a string of 
28 - HTABSIZE 0-bits followed by a string of HTABSIZE 1-bits. The 
1-bits determine which additional bits (beyond the minimum of 11) from 
the hash are used in the index; HTABORG must have this same number 
of low-order bits equal to 0. See Figure 65 on page 407. 

Example: Suppose that the Page Table is 16,384 (2^14) 128-byte PTEGs,
for a total size of 2^21 bytes (2 MB). A 14-bit index is required. Eleven bits
are provided from the hash to start with, so 3 additional bits from the
hash must be selected. Thus the value in HTABSIZE must be 3 and the
value in HTABORG must have its low-order 3 bits (bits 43:45 of SDR1)
equal to 0. This means that the Page Table must begin on a
2^(3+11+7) = 2^21 = 2 MB boundary.
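
The sizing rule from the Programming Note and the HTABSIZE arithmetic in the example can be checked with a short calculation. The Python sketch below is illustrative only: the function names are hypothetical and not part of the architecture, and it assumes 4 KB pages and 128-byte PTEGs as stated above.

```python
PAGE_SHIFT = 12        # 4 KB pages
PTEG_BYTES = 128       # 64-bit implementations: 8 PTEs per 128-byte PTEG
MIN_INDEX_BITS = 11    # hash bits that are always used in the Page Table index


def recommended_pteg_count(real_memory_bytes):
    """At least one PTEG per two real pages, never below the minimum table."""
    real_pages = real_memory_bytes >> PAGE_SHIFT
    return max(real_pages // 2, 1 << MIN_INDEX_BITS)


def htabsize_for(pteg_count):
    """Number of additional hash bits (the HTABSIZE value) for a given table."""
    index_bits = pteg_count.bit_length() - 1
    if pteg_count != 1 << index_bits or index_bits < MIN_INDEX_BITS:
        raise ValueError("PTEG count must be a power of 2, at least 2^11")
    return index_bits - MIN_INDEX_BITS


def table_bytes(pteg_count):
    """Total Page Table size, which is also its required alignment."""
    return pteg_count * PTEG_BYTES
```

For the example above, htabsize_for(16384) gives 3 and table_bytes(16384) gives 2 MB; for 2 GB of real memory, recommended_pteg_count returns 2^18 PTEGs, a 32 MB table.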

Hashed Page Table Search 

An outline of the HTAB search process is shown in Figure 65 on 
page 407. The detailed algorithm is as follows: 

1. Primary Hash: A 39-bit hash value is computed by Exclusive ORing
the low-order 39 bits of the VSID with a 39-bit value formed by con- 
catenating 23 bits of 0 with the page index. 

2. The 64-bit real address of a PTEG is formed by concatenating the fol- 
lowing values: 

■ Bits 0:17 of SDR1 (the 18 high-order bits of HTABORG).

■ Bits 0:27 of the value formed in step 1 ANDed with the mask
generated from bits 59:63 of SDR1 (HTABSIZE) and then ORed with
bits 18:45 of SDR1 (the 28 low-order bits of HTABORG).

■ Bits 28:38 of the value formed in step 1.

■ A 7-bit field of 0s.

This operation, referred to as the "Primary HTAB Hash," identifies a 
particular PTEG, each of whose 8 PTEs will be tested in turn. 

3. The first PTE in the selected PTEG is tested for a match with VPN. In 
order for a match to exist, the following must be true: 

■ PTE_H=0

■ PTE_V=1

■ PTE_VSID=VA0:51

■ PTE_API=VA52:56

If a match is found, the PTE search terminates successfully.

4. Step 3 is repeated for each of the other 7 PTEs in the PTEG. The first
matching PTE terminates the search. If none of the 8 PTEs match, the 
secondary hash must be tried. 

5. Secondary Hash: A 39-bit hash value is computed by taking the 
one's complement of the Exclusive OR of the low-order 39 bits of the 
VSID with a 39-bit value formed by concatenating 23 bits of 0 with 
the page index. 

6. The 64-bit real address of a PTEG is formed by concatenating the fol- 
lowing values: 

■ Bits 0:17 of SDR1 (the 18 high-order bits of HTABORG).

■ Bits 0:27 of the value formed in step 5 ANDed with the mask
generated from bits 59:63 of SDR1 (HTABSIZE) and then ORed with
bits 18:45 of SDR1 (the 28 low-order bits of HTABORG).

■ Bits 28:38 of the value formed in step 5.

■ A 7-bit field of 0s.

This operation is referred to as the "Secondary HTAB Hash." 

7. The first PTE in the selected PTEG is tested for a match with VPN. In 
order for a match to exist, the following must be true: 

■ PTE_H=1

■ PTE_V=1

■ PTE_VSID=VA0:51

■ PTE_API=VA52:56

If a match is found, the PTE search terminates successfully. 

8. Step 7 is repeated for each of the other 7 PTEs in the PTEG. The first 
matching PTE terminates the search. If none of the 8 PTEs match, the 
search fails. 

If the Page Table search succeeds, the content of the PTE that trans- 
lates the EA is returned. The real address (RA) is formed by concatenat- 
ing the RPN from the matching PTE with bits 52:63 of the effective 
address (the byte offset). 

If the search fails, a page fault interrupt is taken. This will be an 
Instruction Storage interrupt or a Data Storage interrupt, depending on 
whether the effective address is for an instruction fetch or for data access. 
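
The PTEG address formation in steps 1, 2, 5, and 6 is easier to follow as bit arithmetic. The following Python sketch is a non-authoritative model: the function names are hypothetical, SDR1 is passed as a 64-bit integer, and IBM bit numbering (bit 0 is the most significant bit) is converted to shifts by hand.

```python
HASH_BITS = 39
HASH_MASK = (1 << HASH_BITS) - 1


def hash64(vsid, page_index, secondary=False):
    """Steps 1 and 5: XOR the low-order 39 VSID bits with the page index."""
    value = (vsid & HASH_MASK) ^ (page_index & 0xFFFF)
    # The secondary hash is the one's complement of the primary hash.
    return (~value & HASH_MASK) if secondary else value


def pteg_address64(sdr1, hash_value):
    """Steps 2 and 6: build the 64-bit real address of the selected PTEG."""
    htaborg = sdr1 >> 18                 # SDR1 bits 0:45
    htabsize = sdr1 & 0x1F               # SDR1 bits 59:63
    mask = (1 << htabsize) - 1           # 0b00...011...1
    high18 = htaborg >> 28               # SDR1 bits 0:17
    mid28 = ((hash_value >> 11) & mask) | (htaborg & 0xFFFFFFF)
    low11 = hash_value & 0x7FF           # hash bits 28:38
    # 18 + 28 + 11 bits, followed by a 7-bit field of 0s.
    return (high18 << 46) | (mid28 << 18) | (low11 << 7)
```

With a 2 MB Page Table at real address 0x200000 and HTABSIZE = 3 (an SDR1 value of 0x200003), a VSID of 1 and a page index of 2 select the PTEG at 0x200180 for the primary hash and at 0x3FFE00 for the secondary hash. Both addresses fall inside the table, and the two PTEG indexes are complements of each other.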

Programming Notes

1. Page Table Entries may or may not be cached in a TLB.

2. Page Table lookups are done using real addresses and storage
access mode M=1 (Memory Coherence).

3. It is possible that the hardware implements two TLB arrays (one for
data and one for instructions). In this case, the size, shape, and
values contained by the arrays may be different.

4. Use the tlbie or tlbia instruction to ensure that the TLB no longer
contains a mapping for a particular page.

5. Refer to Book IV, PowerPC Implementation Features, for the
procedure to be used to invalidate the entire TLB.



Translation Lookaside Buffer 

Conceptually, the Page Table is searched by the address relocation hard- 
ware to translate every reference. For performance reasons, the hardware 
usually keeps a Translation Lookaside Buffer (TLB) that holds PTEs that 
have recently been used. The TLB is searched prior to searching the Page 
Table. As a consequence, when software makes changes to the Page Table 
it must perform the appropriate TLB invalidate operations to maintain 
the consistency of the TLB with the Page Table. 



4.5 Segmented Address Translation,
32-Bit Implementations

Figure 68 shows the steps involved in translating from an effective 
address to a real address on a 32-bit implementation. 

Figure 68. Address translation overview (32-bit implementations)

If an access is translated by the Block Address Translation mechanism
(BAT, see Section 4.7 on page 423), the BAT takes precedence and the 
results of segmented address translation are not used. If an access is not 
translated by a BAT, segmented address translation proceeds as follows. 

The effective address (EA) is a 32-bit quantity computed by the pro- 
cessor. Bits 0:3 of the EA are the Segment Register number. These are 
used to select a Segment Register, from which is extracted a Virtual Seg- 
ment ID (VSID). Bits 4:19 of the EA are the Page Number within the seg- 
ment; these are concatenated with the VSID from the Segment Register to 
form the Virtual Page Number (VPN). The VPN is looked up in the Page 
Table to produce a Real Page Number (RPN). Bits 20:31 of the EA are 
the byte offset within the page; these are concatenated with the RPN to 
form the real address (RA) that is used to access storage. 

If the selected Segment Register identifies the segment as a direct-store 
segment, the Page Table is not referred to. Rather, translation continues 
as described in Section 4.6, "Direct-Store Segments," on page 421. 

For ordinary segments the translation moves in two steps from effec- 
tive address to virtual address (which never exists as a specific entity but 
can be considered to be the concatenation of the VPN and byte offset) 
and from virtual address to real address. 

The first step in segmented address translation is to convert the effec- 
tive address to a virtual address, as described in Section 4.5.1. The second 
step, conversion of the virtual address to a real address, is described in 
Section 4.5.2 on page 415. 

4.5.1 Virtual Address Generation, 32-Bit
Implementations

Conversion of a 32-bit effective address to a virtual address is done by 
using the four high-order bits of the EA to select a Segment Register, as 
shown in Figure 69 on page 414. 

Segment Registers 

The 16 32-bit Segment Registers are present only in 32-bit implementa- 
tions. Figure 70 on page 415 shows the layout of a Segment Register. The 
fields in the Segment Register are interpreted according to the value of bit 
0 (the T bit). 

Figure 69. Translation of 32-bit effective address to virtual address

If an access is translated by the Block Address Translation mechanism
(BAT, see Section 4.7 on page 423), the BAT takes precedence and the
results of translation using Segment Registers are not used. If an access is
not translated by a BAT, and T=0 in the selected Segment Register, the
effective address is a reference to an ordinary segment. The 52-bit virtual
address (VA) is formed by concatenating

■ the 24-bit VSID field from the Segment Register,

■ the 16-bit page index, EA4:19, and

■ the 12-bit byte offset, EA20:31.



The VA is then translated to a real address as described in the next sec- 
tion. 
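
As an illustration of the concatenation just described, the following Python sketch splits a 32-bit effective address and forms the 52-bit virtual address. The function names are hypothetical, and the VSID is passed in directly rather than extracted from a Segment Register image, so no assumption about the register layout is needed.

```python
def split_ea32(ea):
    """Split a 32-bit EA into Segment Register number, page index, byte offset."""
    sr_number = (ea >> 28) & 0xF       # EA0:3
    page_index = (ea >> 12) & 0xFFFF   # EA4:19
    byte_offset = ea & 0xFFF           # EA20:31
    return sr_number, page_index, byte_offset


def virtual_address(vsid, ea):
    """Concatenate the 24-bit VSID, 16-bit page index, and 12-bit byte offset."""
    _, page_index, byte_offset = split_ea32(ea)
    return ((vsid & 0xFFFFFF) << 28) | (page_index << 12) | byte_offset
```

For example, the effective address 0x12345678 selects Segment Register 1, page index 0x2345, and byte offset 0x678; with a VSID of 0xABCDEF the virtual address is 0xABCDEF2345678.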

If T=1 in the selected Segment Register (and the access is not
translated by a BAT), the effective address is a reference to a direct-store
segment. No reference is made to the Page Table; processing continues as
described in Section 4.6, "Direct-Store Segments," on page 421.

Figure 70. Segment Register format

4.5.2 Virtual to Real Translation, 32-Bit 
Implementations 

Conversion of a 52-bit virtual address to a real address is done by
searching a hashed page table located by SDR1, as shown in Figure 71 on
page 416.

Generation of the 52-bit virtual address that is input to this stage of 
the translation process is described in Section 4.5.1, "Virtual Address 
Generation, 32-Bit Implementations," on page 413. 



[Figure 71 diagram: the 52-bit virtual address (24-bit VSID, 16-bit page
index, 12-bit byte offset) is passed through the hash function and combined
with HTABORG and HTABMASK from SDR1 to form the 32-bit real address of a
64-byte Page Table Entry Group (PTE0 through PTE7, 8 bytes each). The
20-bit RPN from the matching PTE is concatenated with the 12-bit byte
offset to form the 32-bit real address.]

Figure 71. Translation of 52-bit virtual address to 32-bit real address

Page Table

The Hashed Page Table (HTAB) is a variable-sized data structure that 
defines the mapping between Virtual Page Numbers and Real Page Num- 
bers. The HTAB's size must be a power of 2, and its starting address must 
be a multiple of its size. 

The HTAB contains a number of Page Table Entry Groups (PTEGs). A 
PTEG contains eight Page Table Entries (PTEs) of eight bytes each; each 
PTEG is thus 64 bytes long. PTEGs are entry points for searches of the 
Page Table. 

See Section 4.12, "Table Update Synchronization Requirements," on 
page 446 for the rules that software must follow when updating the Page 
Table. 

Page Table Entry 

Each Page Table Entry (PTE) maps one VPN to one RPN. Additional 
information in the PTE controls the HTAB search process and provides 
input to the storage protection mechanism. Figure 72 shows the layout of 
a PTE. 

Figure 72. Page Table Entry, 32-bit implementations

The PTE contains an Abbreviated Page Index rather than the complete
Page field. At least 10 of the low-order bits of the Page field are used in 
the hash function to select a PTEG. These bits are not repeated in the 
PTEs of that PTEG. 

Page Table Size 

The number of entries in the Page Table directly affects performance 
because it influences the hit ratio in the Page Table and thus the rate of 
page fault interrupts. If the table is too small, it is possible that not all the 
virtual pages that actually have real pages assigned can be mapped via the 
Page Table. This can happen if too many hash collisions occur and there 
are more than 16 entries for the same primary/secondary pair of PTEGs. 
While this situation cannot be guaranteed not to occur for any size Page 
Table, making the Page Table larger than the minimum size will reduce 
the frequency of such collisions. 



Storage Description Register 1 

The SDR1 register is shown in Figure 73.

Figure 73. SDR1, 32-bit implementations

The HTABORG field in SDR1 contains the high-order 16 bits of the
32-bit real address of the Page Table. The Page Table is thus constrained
to lie on a 2^16 byte (64 KB) boundary at a minimum. At least 10 bits
from the hash function (see Figure 71 on page 416) are used to index into
the Page Table. The minimum size Page Table is 64 KB (2^10 PTEGs of 64
bytes each).

The Page Table can be any size 2^n where 16 ≤ n ≤ 25. As the table size
is increased, more bits are used from the hash to index into the table and
the value in HTABORG must have more of its low-order bits equal to 0.

The HTABMASK field in SDR1 contains a mask value that determines
how many bits from the hash are used in the Page Table index.
This mask must be of the form 0b00...011...1, that is, a string of 0-bits
followed by a string of 1-bits. The 1-bits determine how many additional
bits (beyond the minimum of 10) from the hash are used in the index;
HTABORG must have this same number of low-order bits equal to 0. See
Figure 71 on page 416.

Programming Note

It is recommended that the number of PTEGs in the Page Table be at
least one-half the number of real pages to be accessed.

As an example, if the amount of real memory to be accessed is 2^29
bytes (512 MB), then we have 2^(29-12) = 2^17 real pages. The minimum
recommended Page Table size would be 2^16 PTEGs, or 2^22 bytes (4 MB).

Example: Suppose that the Page Table is 8,192 (2^13) 64-byte PTEGs,
for a total size of 2^19 bytes (512 KB). A 13-bit index is required. Ten bits
are provided from the hash to start with, so 3 additional bits from the
hash must be selected. Thus the value in HTABMASK must be 0x007 and
the value in HTABORG must have its low-order 3 bits (bits 13:15 of
SDR1) equal to 0. This means that the Page Table must begin on a
2^(3+10+6) = 2^19 = 512 KB boundary.
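
The relationship among table size, HTABMASK, and table alignment can be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, and the SDR1 image is assembled from the two fields described above (HTABORG in bits 0:15, HTABMASK in bits 23:31) with all reserved bits left as 0.

```python
PTEG_BYTES = 64        # 32-bit implementations: 8 PTEs of 8 bytes each
MIN_INDEX_BITS = 10    # hash bits that are always used in the index
MAX_INDEX_BITS = 19    # largest table is 2^25 bytes, that is 2^19 PTEGs


def htabmask_for(pteg_count):
    """Mask with one 1-bit per hash bit used beyond the minimum of 10."""
    index_bits = pteg_count.bit_length() - 1
    if pteg_count != 1 << index_bits:
        raise ValueError("PTEG count must be a power of 2")
    if not MIN_INDEX_BITS <= index_bits <= MAX_INDEX_BITS:
        raise ValueError("Page Table size out of range")
    return (1 << (index_bits - MIN_INDEX_BITS)) - 1


def make_sdr1_32(table_base, pteg_count):
    """Assemble an SDR1 value, rejecting a misaligned Page Table."""
    mask = htabmask_for(pteg_count)
    if table_base % (pteg_count * PTEG_BYTES) != 0:
        raise ValueError("Page Table must start on a multiple of its size")
    htaborg = (table_base >> 16) & 0xFFFF
    return (htaborg << 16) | mask
```

For the example above, htabmask_for(8192) is 0x007, and a table placed at real address 0x80000 (512 KB) gives an SDR1 value of 0x00080007; a base of 0x40000 is rejected because it is not a multiple of 512 KB.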

Hashed Page Table Search 

An outline of the HTAB search process is shown in Figure 71 on 
page 416. The detailed algorithm is as follows: 

1. A 19-bit hash value is computed by Exclusive ORing the low-order 19 
bits of the VSID with a 19-bit value formed by concatenating 3 bits of 
0 with the page index. 

2. Primary Hash: The 32-bit real address of a PTEG is formed by con- 
catenating the following values: 

■ Bits 0:6 of SDR1 (the 7 high-order bits of HTABORG).

■ Bits 0:8 of the value formed in step 1 ANDed with bits 23:31 of
SDR1 (the value of HTABMASK) and then ORed with bits 7:15 of
SDR1 (the 9 low-order bits of HTABORG).

■ Bits 9:18 of the value formed in step 1.

■ A 6-bit field of 0s.

This operation, referred to as the "Primary HTAB Hash," identifies a 
particular PTEG, each of whose 8 PTEs will be tested in turn. 

3. The first PTE in the selected PTEG is tested for a match with VPN. In 
order for a match to exist, the following must be true: 

■ PTE_H=0

■ PTE_V=1

■ PTE_VSID=VA0:23

■ PTE_API=VA24:29

If a match is found, the PTE search terminates successfully. 

4. Step 3 is repeated for each of the other 7 PTEs in the PTEG. The first 
matching PTE terminates the search. If none of the 8 PTEs match, the 
secondary hash must be tried. 

5. A 19-bit hash value is computed by taking the one's complement of 
the Exclusive OR of the low-order 19 bits of the VSID with a 19-bit 
value formed by concatenating 3 bits of 0 with the page index. 

6. Secondary Hash: The 32-bit real address of a PTEG is formed by 
concatenating the following values: 

■ Bits 0:6 of SDR1 (the 7 high-order bits of HTABORG).

■ Bits 0:8 of the value formed in step 5 ANDed with bits 23:31 of
SDR1 (the value of HTABMASK) and then ORed with bits 7:15 of
SDR1 (the 9 low-order bits of HTABORG).

■ Bits 9:18 of the value formed in step 5.

■ A 6-bit field of 0s.

This operation is referred to as the "Secondary HTAB Hash." 

7. The first PTE in the selected PTEG is tested for a match with VPN. In 
order for a match to exist, the following must be true: 

■ PTE_H=1

■ PTE_V=1

■ PTE_VSID=VA0:23

■ PTE_API=VA24:29

If a match is found, the PTE search terminates successfully. 

8. Step 7 is repeated for each of the other 7 PTEs in the PTEG. The first 
matching PTE terminates the search. If none of the 8 PTEs match, the 
search fails. 

If the Page Table search succeeds, the content of the PTE that trans- 
lates the EA is returned. The Real Address (RA) is formed by concatenat- 
ing the RPN from the matching PTE with bits 20:31 of the effective 
address (the byte offset). 



If the search fails, a page fault interrupt is taken. This will be an 
Instruction Storage interrupt or a Data Storage interrupt, depending on 
whether the effective address is for an instruction fetch or for data access. 
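
The eight steps above can be modeled in software. The Python sketch below is a simplified illustration, not a description of any implementation: the function names are hypothetical, real storage is modeled as a dictionary mapping the real address of a PTE to its two 32-bit words, and the PTE layout is assumed to be the one in Figure 72 (V in bit 0, VSID in bits 1:24, H in bit 25, API in bits 26:31 of the first word; RPN in bits 0:19 of the second word).

```python
HASH_MASK = 0x7FFFF  # 19-bit hash


def hash32(vsid, page_index, secondary=False):
    """Steps 1 and 5: XOR the low-order 19 VSID bits with the page index."""
    value = (vsid & HASH_MASK) ^ (page_index & 0xFFFF)
    return (~value & HASH_MASK) if secondary else value


def pteg_address32(sdr1, hash_value):
    """Steps 2 and 6: build the 32-bit real address of the selected PTEG."""
    htaborg = (sdr1 >> 16) & 0xFFFF      # SDR1 bits 0:15
    htabmask = sdr1 & 0x1FF              # SDR1 bits 23:31
    high7 = htaborg >> 9                 # SDR1 bits 0:6
    mid9 = ((hash_value >> 10) & htabmask) | (htaborg & 0x1FF)
    low10 = hash_value & 0x3FF           # hash bits 9:18
    return (high7 << 25) | (mid9 << 16) | (low10 << 6)


def first_word(vsid, page_index, h_bit):
    """First PTE word that steps 3 and 7 look for (layout assumed)."""
    api = (page_index >> 10) & 0x3F      # VA24:29
    return (1 << 31) | ((vsid & 0xFFFFFF) << 7) | (h_bit << 6) | api


def translate(memory, sdr1, vsid, ea):
    """Return the real address, or None where a page fault would be taken."""
    page_index = (ea >> 12) & 0xFFFF
    for h_bit in (0, 1):                 # primary hash first, then secondary
        base = pteg_address32(sdr1, hash32(vsid, page_index, bool(h_bit)))
        wanted = first_word(vsid, page_index, h_bit)
        for slot in range(8):            # test each of the 8 PTEs in turn
            word0, word1 = memory.get(base + 8 * slot, (0, 0))
            if word0 == wanted:
                rpn = word1 >> 12        # RPN in bits 0:19 of second word
                return (rpn << 12) | (ea & 0xFFF)
    return None
```

With SDR1 = 0x00080007 (a 512 KB table at 0x80000), VSID 0x123, and page index 5, the primary PTEG is at 0x84980 and the secondary PTEG is at 0xFB640. The sketch ignores the TLB, the Reference and Change bits, and storage protection.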

Translation Lookaside Buffer

Conceptually, the Page Table is searched by the address relocation
hardware to translate every reference. For performance reasons, the hardware
usually keeps a Translation Lookaside Buffer (TLB) that holds PTEs that
have recently been used. The TLB is searched prior to searching the Page
Table. As a consequence, when software makes changes to the Page Table
it must perform the appropriate TLB invalidate operations to maintain
the consistency of the TLB with the Page Table.

Programming Notes

1. Page Table Entries may or may not be cached in a TLB.

2. Page Table lookups are done using real addresses and storage
access mode M=1 (Memory Coherence).

3. It is possible that the hardware implements two TLB arrays (one for
data and one for instructions). In this case, the size, shape, and
values contained by the arrays may be different.

4. Use the tlbie or tlbia instruction to ensure that the TLB no longer
contains a mapping for a particular page.

5. Refer to Book IV, PowerPC Implementation Features, for the
procedure to be used to invalidate the entire TLB.



4.6 Direct-Store Segments

A direct-store segment is a mapping of effective addresses onto an
external address space, typically an I/O bus.

Compatibility Note

Direct-store segments are provided for POWER compatibility.
Applications that require low-latency load/store access to an external
address space should consider more traditional methods.

Effective addresses that lie within direct-store segments complete only
the first step of segmented address translation.

■ In 64-bit implementations, this is the search of the Segment Table. If
the resulting Segment Table Entry has T=1, the reference is to a
direct-store segment.

■ In 32-bit implementations, this is the selection of the Segment Register.
If the Segment Register has T=1, the reference is to a direct-store
segment.

Direct-store data accesses are performed as though the storage access
mode bits "WIMG" were 0101 (see Section 4.8).



4.6.1 Completion of Direct-Store Access

If an access is translated by the Block Address Translation mechanism
(BAT, see Section 4.7), the BAT takes precedence and the results of
segmented address translation are not used. If an access is not translated by
a BAT, and the segmented address translation process has discovered that
the segment has T=1, translation terminates. No reference is made to the
Page Table; Reference and Change bits are not updated. The following
data are sent to the storage controller:

For 64-bit implementations:

■ A one-bit field representing the privilege of the storage access,
computed as follows:

Key ← (Kp & MSR_PR) | (Ks & ¬MSR_PR)

■ The 32-bit IO field from bits 32:63 of the second doubleword
of the STE

■ The low-order 28 bits of the effective address.

For 32-bit implementations:

■ A one-bit field representing the privilege of the storage access,
computed as follows:

Key ← (Kp & MSR_PR) | (Ks & ¬MSR_PR)

■ The contents of bits 3:31 of the Segment Register, which is the
BUID field concatenated with the "controller specific" field

■ The low-order 28 bits of the effective address, EA4:31

An implementation of the PowerPC Architecture may cause multiple 
address/data transfers for a single instruction. The address for each trans- 
fer will be handled in the same manner that addresses for access to main 
storage are handled. 
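
The Key computation and the data sent to the storage controller on a 32-bit implementation can be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, and the positions of the T, Ks, and Kp bits (bits 0, 1, and 2 of the Segment Register) are taken as an assumption from the Segment Register format in Figure 70.

```python
def direct_store_key(ks, kp, msr_pr):
    """Key <- (Kp & MSR_PR) | (Ks & not MSR_PR), with all operands 0 or 1."""
    return (kp & msr_pr) | (ks & (msr_pr ^ 1))


def direct_store_request32(segment_register, ea, msr_pr):
    """Return (key, SR bits 3:31, EA4:31) for an access to a T=1 segment."""
    t_bit = (segment_register >> 31) & 1   # bit 0 (assumed position)
    ks = (segment_register >> 30) & 1      # bit 1 (assumed position)
    kp = (segment_register >> 29) & 1      # bit 2 (assumed position)
    if t_bit != 1:
        raise ValueError("not a direct-store segment")
    key = direct_store_key(ks, kp, msr_pr)
    sr_bits_3_31 = segment_register & 0x1FFFFFFF   # BUID and controller field
    ea_low28 = ea & 0x0FFFFFFF                     # EA4:31
    return key, sr_bits_3_31, ea_low28
```

In problem state (MSR_PR = 1) the key is Kp; in supervisor state (MSR_PR = 0) it is Ks.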

4.6.2 Direct-Store Segment Protection 

Page-level protection as described in Section 4.10.1, "Page Protection,"
on page 437 is not provided by the PowerPC processor for direct-store
segments. The appropriate key bit (Ks or Kp) from the STE or Segment
Register is sent to the storage controller, but it is up to the storage
controller to implement any protection mechanism. Frequently no such
mechanism will be provided; the fact that a direct-store segment is
mapped into the address space of a process may be regarded as sufficient
authority to access the segment.

4.6.3 Instructions Not Supported for T=1

The following instructions are not supported when they specify an
effective address in a segment where T=1:

■ lwarx

■ ldarx

■ eciwx

■ stwcx.

■ stdcx.

■ ecowx

If one of these instructions is executed specifying an effective address
in a segment where T=1, either a Data Storage interrupt occurs or the
results are boundedly undefined.



4.6.4 Instructions with No Effect for T=1

The following instructions are treated as no-ops when they specify an
effective address in a segment where T=1:

■ dcbt

■ dcbtst

■ dcbst

■ dcbz

■ icbi

■ dcbf

■ dcbi

For further details of storage references to direct-store segments, refer
to Book IV, PowerPC Implementation Features.



4.7 Block Address Translation

The Block Address Translation (BAT) mechanism provides a means for
mapping ranges of virtual addresses larger than a single page onto
contiguous areas of real storage. Such areas can be used for data that are not
subject to normal virtual storage handling (paging), such as a
memory-mapped display buffer or an extremely large array of numeric data.

4.7.1 Recognition of Addresses in BAT Areas

Block Address Translation is enabled only when address translation is
enabled (MSR_IR=1 or MSR_DR=1 or both).

Special Purpose Registers (SPRs) called BAT registers define the
starting addresses and sizes of BAT areas. The BAT registers are accessed in
parallel with segmented address translation to determine whether a
particular EA corresponds to a BAT area. If an EA is within a BAT area, the
real address for storage access is determined as described below.

It is possible to set up the BAT registers and the segmented address
translation mechanism such that a particular effective address is within a
BAT area and also is covered by page translation. When this happens, the
BAT takes precedence over entries in the Segment Table or the content of
a Segment Register (including the T bit).

Programming Note

It is possible for a BAT area to overlay part of an ordinary segment,
such that the BAT portion is nonpageable while the rest of the segment is
pageable. If this is done, it is not necessary to supply Page Table Entries
for the portion of the segment overlaid by the BAT.

Programming Note

If the same storage address is to be mapped via BAT for both I-fetch
and data load and store, it is necessary to load the mapping into both an
IBAT pair and a DBAT pair. This is true even on an implementation that
does not have split I and D caches.



The BAT areas are defined by pairs of SPRs. These SPRs can be read or 
written by the mfspr and mtspr instructions; see pages 384 and 387. 
Access to these SPRs is privileged. The layout of the BAT registers is 
shown in Figure 74 on page 425 for 64-bit implementations and in
Figure 75 on page 426 for 32-bit implementations. 

Four pairs of BAT registers are provided for translating instruction 
addresses (the IBAT registers), and four pairs are provided for translating 
data addresses (the DBAT registers). 

It is an error for system software to set up the BAT registers such that 
an effective address is translated by more than one IBAT pair or by more 
than one DBAT pair. If this error occurs, the results are undefined and 
may include a violation of the storage protection mechanism, a Machine 
Check interrupt, or a Checkstop. 

Each pair of BAT registers defines the starting address of a BAT area in 
effective address space, the length of the area, and the start of the corre- 
sponding area in real address space. If an effective address is within the 
range of EAs defined by a pair of BAT registers that is valid (see below) 
for the access, its real address is developed by (conceptually) subtracting 
the starting effective address of the BAT area from the EA and adding the 
starting real address of the BAT area. 

BAT areas are restricted to a finite set of allowable lengths, all of
which are powers of 2. The smallest BAT area defined is 128 KB (2^17
bytes). The largest BAT area defined is 256 MB (2^28 bytes). The starting
address of a BAT area in both EA space and RA space must be a multiple
of the area's length.

4.7.2 BAT Registers

See Section 3.4.1, "Move to/from System Register Instructions," on
page 384 for a list of the SPR numbers for the BAT registers. See
Appendix B, "Assembler Extended Mnemonics," on page 495 for a list of
extended mnemonics for use with the BAT registers.

Register  Bit(s)  Name  Description

Upper     0:46    BEPI  Block Effective Page Index
          51:61   BL    Block Length
          62      Vs    Supervisor state valid bit
          63      Vp    Problem state valid bit
Lower     0:46    BRPN  Block Real Page Number
          57:60   WIMG  Storage access controls
                        (bit 60 is reserved in IBATs)
          62:63   PP    Protection bits for BAT area

All other fields are reserved.

Figure 74. BAT registers, 64-bit implementations

Register  Bit(s)  Name  Description

Upper     0:14    BEPI  Block Effective Page Index
          19:29   BL    Block Length
          30      Vs    Supervisor state valid bit
          31      Vp    Problem state valid bit
Lower     0:14    BRPN  Block Real Page Number
          25:28   WIMG  Storage access controls
                        (bit 28 is reserved in IBATs)
          30:31   PP    Protection bits for BAT area

All other fields are reserved.

Figure 75. BAT registers, 32-bit implementations



The equation for determining whether a BAT entry is valid for a
particular access is:

BAT_entry_valid = (Vs & ¬MSR_PR) | (Vp & MSR_PR)

If a BAT entry is not valid for a given access, it does not participate in
address translation for that access.

Two BAT entries may not map an overlapping effective address range
and be valid at the same time.

Programming Note

Entries that have complementary settings of Vs and Vp may map
overlapping effective address blocks. Complementary settings would be:

BAT entry A: Vs = 1, Vp = 0
BAT entry B: Vs = 0, Vp = 1

The BL field in the upper BAT register is a mask that encodes the
length of the BAT area.

BAT Area Length   BL

128 KB            000 0000 0000
256 KB            000 0000 0001
512 KB            000 0000 0011
1 MB              000 0000 0111
2 MB              000 0000 1111
4 MB              000 0001 1111
8 MB              000 0011 1111
16 MB             000 0111 1111
32 MB             000 1111 1111
64 MB             001 1111 1111
128 MB            011 1111 1111
256 MB            111 1111 1111



Only the values shown are valid for BL. The rightmost bit of BL is
aligned with bit 46 {14} of the EA.
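
The BL encodings in the table follow a simple pattern: the mask has one 1-bit for each doubling of the area beyond 128 KB. The Python sketch below, with hypothetical function names, derives the BL value from an area length and back again.

```python
MIN_BAT_BYTES = 128 * 1024          # 2^17, smallest BAT area
MAX_BAT_BYTES = 256 * 1024 * 1024   # 2^28, largest BAT area


def bl_for_length(length_bytes):
    """Return the 11-bit BL mask for a valid BAT area length."""
    is_power_of_two = length_bytes & (length_bytes - 1) == 0
    if not is_power_of_two or not MIN_BAT_BYTES <= length_bytes <= MAX_BAT_BYTES:
        raise ValueError("not an allowable BAT area length")
    return (length_bytes >> 17) - 1


def length_for_bl(bl):
    """Inverse mapping: area length in bytes for a valid BL mask."""
    return (bl + 1) << 17
```

For instance, bl_for_length(1 << 20) returns 0b111 for a 1 MB area, matching the table.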

An effective address is determined to be within a BAT area if EA
matches BEPI. The boundary between the string of 0s and the string of 1s
in BL determines the bits of EA that participate in the comparison with
BEPI. A match occurs if the following expression is true on a 64-bit
implementation:

EA0:35 || (EA36:46 & ¬BL) = BEPI

Note: In 32-bit mode, EA0:31 are treated as zeros.

A match occurs if the following expression is true on a 32-bit
implementation:

EA0:3 || (EA4:14 & ¬BL) = BEPI

Bits in EA corresponding to 1s in BL, concatenated with the 17 bits of
EA to the right of BL, form the offset within the BAT area.

The value in BL must be one of those given in the table above, and the
values in BEPI and BRPN must have at least as many low-order 0s as
there are 1s in BL. If these rules are violated, the results are undefined.
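
The validity equation and the 32-bit match expression can be combined into a single check. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the upper BAT register is passed as a 32-bit integer with the field positions of Figure 75 (BEPI in bits 0:14, BL in bits 19:29, Vs in bit 30, Vp in bit 31).

```python
def bat_entry_valid(vs, vp, msr_pr):
    """BAT_entry_valid = (Vs & not MSR_PR) | (Vp & MSR_PR), operands 0 or 1."""
    return (vs & (msr_pr ^ 1)) | (vp & msr_pr)


def bat_match32(upper_bat, ea, msr_pr):
    """True if the BAT entry is valid for the access and the EA lies in its area."""
    bepi = (upper_bat >> 17) & 0x7FFF   # bits 0:14
    bl = (upper_bat >> 2) & 0x7FF       # bits 19:29
    vs = (upper_bat >> 1) & 1           # bit 30
    vp = upper_bat & 1                  # bit 31
    if not bat_entry_valid(vs, vp, msr_pr):
        return False
    ea_index = (ea >> 17) & 0x7FFF      # EA0:14
    # BL covers only EA4:14, so EA0:3 always take part in the comparison.
    return (ea_index & ~bl & 0x7FFF) == bepi
```

As a usage example, an upper BAT value of 0x80000006 describes a 256 KB area at effective address 0x80000000 that is valid only in supervisor state: 0x8003FFFF matches when MSR_PR = 0, while 0x80040000 does not, and nothing matches when MSR_PR = 1.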

BAT Storage Protection

If an effective address is determined to be within a BAT area that is valid
for the access, the access is next validated by the storage protection 
scheme described in Section 4.10.2, "BAT Protection," on page 438. If 
this protection mechanism rejects the EA, a page fault (Data Storage 
interrupt or Instruction Storage interrupt) is generated. 



BAT Real Address 

If the protection mechanism accepts the access, then a real address is 
formed as shown in Figure 76 for 64-bit implementations, and in 
Figure 77 on page 429 for 32-bit implementations. 



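[Figure 76 diagram: the high-order 36 bits of the real address are taken
from BRPN; the next 11 bits are formed by ANDing the corresponding EA bits
with BL and ORing the result with the low-order 11 bits of BRPN; the
low-order 17 bits are taken from the EA.]

Figure 76. Formation of real address via BAT, 64-bit implementations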



Access to the real memory of the BAT area is made according to the 
storage mode defined by the "WIMG" bits in the lower BAT register. 
These bits apply to the entire BAT area rather than to an individual page. 
See Section 4.8.2, "Supported Storage Modes," on page 431 for an expla- 
nation of these bits. 
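
[Figure 77 diagram: the high-order 4 bits of the real address are taken
from BRPN; the next 11 bits are formed by ANDing the corresponding EA bits
with BL and ORing the result with the low-order 11 bits of BRPN; the
low-order 17 bits are taken from the EA.]

Figure 77. Formation of real address via BAT, 32-bit implementations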
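
The AND/OR combination shown in Figure 77 can be written out directly. The Python sketch below is a non-authoritative model for 32-bit implementations; the function name is hypothetical, and the field positions are those of Figure 75 (BL in bits 19:29 of the upper register, BRPN in bits 0:14 of the lower register). It assumes the EA has already matched the entry and passed the protection check.

```python
def bat_real_address32(upper_bat, lower_bat, ea):
    """Form the 32-bit real address for an EA that matched this BAT pair."""
    bl = (upper_bat >> 2) & 0x7FF        # upper bits 19:29
    brpn = (lower_bat >> 17) & 0x7FFF    # lower bits 0:14
    # EA bits under the 1s of BL are offset bits within the BAT area.
    block_bits = (ea >> 17) & bl
    # High-order 15 bits come from BRPN ORed with those offset bits;
    # the low-order 17 bits of the EA pass through unchanged.
    return ((brpn | block_bits) << 17) | (ea & 0x1FFFF)
```

Because BEPI and BRPN must have 0s wherever BL has 1s, this is equivalent to the conceptual rule of subtracting the starting effective address of the area and adding its starting real address. For a 256 KB area mapping 0x80000000 to 0x00100000, the EA 0x80023456 yields 0x00123456.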

4.8 Storage Access Modes 

When address relocation is enabled and the effective address generated by 
a storage access is translated by the Segmented Address Translation 
mechanism or by the Block Address Translation mechanism, the access is 
performed under the control of the Page Table Entry or BAT entry used 
to translate the effective address. Each Page Table Entry or DBAT entry 
contains four mode control bits, W, I, M, and G, that specify the storage 
mode for all accesses translated by the entry. The IBAT entry contains the 
W, I, and M bits, but not the G bit. The W and I bits control how the
processor executing the access uses its own cache. The M bit specifies
whether the processor executing the access must use the storage coher- 
ence protocol to ensure that all copies of the addressed storage location 
are made consistent. The G bit controls whether speculative data and 
instruction fetching is permitted. For an access translated by an IBAT 
entry, G is assumed to be 0. 

The mode control bits only have meaning when an effective address is 
translated in the processor performing a storage access. When an access is 
performed for which coherence is required, the processor performing the 
access must inform the coherence mechanism that the access requires
memory coherence. Other processors affected by the access must respond 
to the coherence mechanism. However, since these mode control bits are 
only relevant when an effective address is translated and have no direct 
relation to data in the cache, processors responding to the coherence 
request are able to respond without knowledge of the state of these bits. 

4.8.1 W, I, M, and G bits 

The W, I, M, and G bits in a Page Table Entry or DBAT entry, or the W, I, 
and M bits in an IBAT entry, control the way in which the processor 
accesses cache and main storage. Each bit controls a separate aspect of 
storage references. 

W Write Through 

If the data are in the cache, a store must update that copy of the 
data. In addition, if W=1 the update must be written to the home
storage location (see below). 

Store combining optimizations are allowed except when the store 
instructions are separated by sync or eieio. The architecture pre- 
sumes that data present in the cache are valid and a store may 
cause any part of that data to be copied back to main storage. 

The definition of the home storage location is dependent upon the 
implementation of the memory system but can be illustrated by 
the following examples: 

■ RAM Storage 

The store must be sent to the RAM controller to be written 
into the target RAM. 

■ I/O Adapter Card 

The store must be sent to the adapter card to be written to the 
target register or storage location. 

In systems with multilevel caching, the store must be written to a 
depth in the memory hierarchy that is seen by all processors and 
devices. 

I Caching Inhibited 

If I=1, the storage access is completed by referencing the location
in main storage, bypassing the cache. During the access, the
accessed location is not brought into the cache nor is the location
allocated in the cache.

Load/store combining optimizations are allowed except when the
accesses are separated by sync, or by eieio when the storage access
is also Guarded.

M Memory Coherence 

This mode control is provided to allow improved performance in 
systems in which accesses to storage kept consistent by hardware 
are slower than accesses to storage not kept consistent by hard- 
ware, and in which software is able to enforce the required consis- 
tency. When the mode is off (M=0), the hardware need not 
enforce data coherence. When the mode is on (M=1), the hardware
must enforce data coherence. Because instruction storage
need not be consistent with data storage, it is permissible for an 
implementation to ignore the M bit for instruction fetches. 

G Guarded Storage 

If G=1, accesses to storage must conform to the restrictions
described in Section 4.2.5, "Speculative Execution," on page 396. 



4.8.2 Supported Storage Modes 

The combinations of the Write Through bit, the Caching Inhibited bit, 
and the Memory Coherence bit define eight different storage modes. Six 
of these modes are supported. For each, the G bit may be 0 or 1. 

■ WIM = 000 

1. Data may be cached. 

2. Loads or stores for which the target location is in the cache may use 
that copy of the location. 

3. Exclusive ownership of the block containing the target location is 
not required for store accesses, and consistency operations for the 
block may be ignored when fetching the block, storing it back, or 
changing its state from shared to exclusive. 

■ WIM = 001 

1. Data may be cached.

2. Loads or stores for which the target location is in the cache may use
that copy of the location. 

3. Exclusive ownership of the block containing the target location is 
required before store accesses are allowed. When fetching the 
block, the processor must indicate that consistency is to be enforced 
on the bus transaction. If the state of the block is read shared, the 
processor must gain exclusive use of the block before storing into it. 

■ WIM = 010 

Caching is inhibited. The storage access goes to storage bypassing the 
cache. Hardware-enforced storage consistency is not required. 

■ WIM = 011 

Caching is inhibited. The storage access goes to storage bypassing the 
cache. Storage consistency is enforced by hardware. 

■ WIM = 100 

1. Data may be cached.

2. Loads for which the target location is in the cache may use that 
copy of the location. 

3. Stores must be written to main storage. The target location of the 
store may be cached and must be updated if there. 

4. Exclusive ownership of the block containing the target location is
not required for store accesses, and consistency operations for the 
block may be ignored when fetching the block, storing it back, or 
changing its state from shared to exclusive. 

■ WIM = 101 

1. Data may be cached.

2. Loads for which the target location is in the cache may use that 
copy of the location. 

3. Stores must be written to main storage. The target location of the 
store may be cached and must be updated if there. 

4. Exclusive ownership of the block containing the target location is
required before store accesses are allowed. When fetching the 
block, the processor must indicate that consistency is to be enforced 
on the bus transaction. If the state of the block is read shared, the 
processor must gain exclusive use of the block before storing into it.

■ WIM = 110

This mode would represent memory that is Write Through, Caching 
Inhibited, and Memory Coherence Not Required. This mode is not 
supported. 

■ WIM = 111 

This mode would represent memory that is Write Through, Caching 
Inhibited, and Memory Coherence Required. This mode is not sup- 
ported. 
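
The eight W, I, M combinations listed above can be enumerated mechanically. The following is a minimal sketch in Python, not part of the architecture; the names StorageMode, decode_wim, and supported_modes are hypothetical and chosen only for illustration. It relies on the observation that the two unsupported modes are exactly those with W=1 and I=1.

```python
from typing import NamedTuple


class StorageMode(NamedTuple):
    supported: bool           # False for WIM = 110 and 111
    caching_allowed: bool     # I = 0
    write_through: bool       # W = 1
    coherence_required: bool  # M = 1


def decode_wim(w: int, i: int, m: int) -> StorageMode:
    """Classify one W, I, M combination as listed in Section 4.8.2."""
    if any(bit not in (0, 1) for bit in (w, i, m)):
        raise ValueError("W, I, and M must each be 0 or 1")
    # The two unsupported modes are exactly those with W=1 and I=1.
    supported = not (w == 1 and i == 1)
    return StorageMode(supported, i == 0, w == 1, m == 1)


def supported_modes() -> list:
    """Return the supported WIM settings as strings such as '001'."""
    return [f"{w}{i}{m}"
            for w in (0, 1) for i in (0, 1) for m in (0, 1)
            if decode_wim(w, i, m).supported]
```

Calling supported_modes() yields the six supported settings; the G bit is left out because it may be 0 or 1 for each of them.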

4.8.3 Mismatched WIMG Bits 

Accesses to the same storage location using two effective addresses for 
which the Write Through mode (W bit) differs must meet the memory 
coherence requirements described in Book II, Section 1.5, "Memory 
Coherence," on page 323. 

Accesses to the same storage location using two effective addresses for 
which the Caching mode (I bit) differs must meet the requirement that a 
copy of the target location of an access to Caching Inhibited storage not 
be in the cache. Violation of this requirement is considered a program- 
ming error; software must ensure that the location has not previously 
been brought into the cache or, if it has, that it has been flushed from the 
cache. If the programming error occurs, the result of the access is bound- 
edly undefined. 

Accesses to the same storage location using two effective addresses for 
which the Memory Coherence mode (M bit) or the Guarded mode (G bit) 
differ are always permitted. 

4.9 Reference and Change Recording 

If address translation is enabled (MSR[IR]=1 or MSR[DR]=1), Reference (R)
and Change (C) bits are maintained in the Page Table Entry for each real
page for accesses due to Segment and Page Table address translation.
Reference and change recording is not performed for translations due to BAT
or for direct-store (T=1) segments.

The R and C bits are set automatically by hardware or by software 
assist in conjunction with normal Page Table processing as follows: 

Reference Bit 

As a result of Page Table processing for a storage access (load, 
store, cache instruction, or instruction fetch), the Reference bit
may be set to 1 immediately, or its setting may be delayed until the
storage access is determined to be successful. 

The Reference bit may be set for a speculatively executed access. 
The Reference bit may also be set for accesses that are not per- 
formed when the access is prohibited by page protection, or if the 
access is the result of a string operation of zero length, or if the 
access is a Store Conditional but no store is performed because a 
reservation does not exist. 

Change Bit 

Whenever a data store is executed successfully, as part of the TLB 
lookup procedure the Change bit in the TLB is checked. If it is 
already set to 1, no further action is taken. If the TLB Change bit 
is 0, it is set to 1 and the corresponding Change bit in the Page 
Table Entry is set to 1. 

The PowerPC Architecture requires that the Change bit be set to 1 
only if the store is allowed by storage protection and all branches 
prior to the store that will cause the Change bit to be set have 
been resolved and it has been determined that the store is on the 
path that is to be executed. 

Furthermore, the Change bit may be set even when a store is not 
performed successfully in the following cases: 

1. A Store Conditional (stwcx. or stdcx.) is executed and is 
allowed by the storage protection mechanism, but no store is 
performed because a reservation does not exist. 

2. A Store String Word Indexed (stswx) is executed and is 
allowed by the storage protection mechanism, but no store is 
performed because the length is zero. 

3. The store operation is not performed because the instruction 
stream is interrupted before the store is performed. 

Execution of either of the Data Cache Block Touch instructions (dcbt,
dcbtst) may result in setting the R bit for a page. Neither instruction may 
result in setting the C bit for a page. 

Figure 78 on page 435 summarizes the rules for setting the Reference 
and Change bits. The table applies to each atomic storage reference. It 
should be read from the top down; the first line matching a given
situation applies. For example, if stwcx. fails due to both a storage protection
violation and the lack of a reservation, the Change bit must not 
be altered.

Status of access                                        R       C

Storage protection violation                            Acc(1)  No
Speculative I-fetch or load-type instruction
    Beyond an unresolved branch                         Acc     No
    Beyond a possible interrupt                         Acc     No
Speculative store-type instruction
    Beyond an unresolved branch                         Acc     No
    Beyond a possible interrupt                         Acc     Acc(1)
Zero-length load (lswx)                                 Acc     No
Zero-length store (stswx)                               Acc     Acc(1)
Store Conditional fails due to lack of a reservation    Acc     Acc(1)
Other non-speculative access:
    I-fetch                                             Yes(3)  No
    Ordinary load or eciwx                              Yes     No
    Ordinary store, ecowx, or dcbz                      Yes     Yes
    icbi, dcbt, dcbtst, dcbst, dcbf                     Acc(1)  No
    dcbi                                                Acc(1)  Acc(1,2)

"Acc" means that it is acceptable to set the R bit, or that it is acceptable to set the C bit if a
store to the location would not violate storage protection.
(1) It is preferable not to set the bit.
(2) If C is set, R must also be set.
(3) This includes the case in which the instruction was speculatively fetched and R was not set
to 1.

Figure 78. Setting the Reference and Change bits
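
The top-down, first-match reading of Figure 78 can be sketched as a short lookup function. This is an illustrative Python model only; the Access record, its field names, and rc_rule are hypothetical, and the footnotes of the figure are not modeled. Each result is a pair (R, C) whose members are "Yes", "Acc", or "No" as in the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Access:
    klass: str                         # "ifetch", "load", or "store" type
    protection_violation: bool = False
    speculative: Optional[str] = None  # None, "branch", or "interrupt"
    zero_length: bool = False          # lswx or stswx with length zero
    lost_reservation: bool = False     # Store Conditional, no reservation
    cache_op: Optional[str] = None     # None, "dcbi", or "other"


def rc_rule(a: Access) -> tuple:
    """Return (R, C); the first matching row of Figure 78 applies."""
    if a.protection_violation:
        return ("Acc", "No")
    if a.speculative is not None:
        if a.klass == "store" and a.speculative == "interrupt":
            return ("Acc", "Acc")
        return ("Acc", "No")
    if a.zero_length:
        return ("Acc", "Acc") if a.klass == "store" else ("Acc", "No")
    if a.lost_reservation:
        return ("Acc", "Acc")
    if a.cache_op == "dcbi":
        return ("Acc", "Acc")
    if a.cache_op == "other":          # icbi, dcbt, dcbtst, dcbst, dcbf
        return ("Acc", "No")
    # Remaining rows: ordinary non-speculative accesses.
    return ("Yes", "Yes") if a.klass == "store" else ("Yes", "No")
```

Because the protection check comes first, a Store Conditional that both violates storage protection and lacks a reservation yields C = "No", matching the example given in the text.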

In the figure, the "load-type" instructions are the load instructions
described in Book I, eciwx, and the Cache Management instructions that
are permitted to be treated as a load with respect to address translation.
The "store-type" instructions are the store instructions described in Book
I, ecowx, and the Cache Management instructions that are treated as a
store with respect to address translation. The "ordinary" load and store
instructions are those described in Book I.

When the hardware does an implied load from the Page Table Entry
due to a TLB miss, or updates the Reference and Change bits in the Page
Table Entry, the accesses are done in real mode, so there are no Reference
or Change bits to update. If software refers to a Page Table Entry when
MSR[DR]=1, then Reference and Change bits in the associated Page Table
Entries are set as for ordinary loads and stores. See Section 4.12, "Table
Update Synchronization Requirements," on page 446 for the rules
software must follow when updating the Reference and Change bits in the
Page Table.



Programming Note 

If it is important that the 
program that loads from 
the PTE retrieve the 
correct R and C bits, a 
sync instruction must be 
executed between a load, 
store, or instruction 
fetch that indirectly sets 
an R or C bit and the load 
of these bits from the 
PTE. 

Programming Note 

On systems with 
Translation Lookaside 
Buffers, the Reference 
and Change bits are set 
only on the basis of TLB 
activity. When software 
resets these bits to zero it 
must synchronize the 
TLB's actions by 
invalidating the TLB 
entries associated with 
the pages whose 
Reference and Change 
bits were reset. 



4.9.1 Synchronization of Reference and 
Change Bit Updates 

If processor A executes a load or store that causes a Reference bit and/or 
Change bit update, the following conditions must be met with respect to 
setting the bits and performing the access: 

1. If processor A subsequently executes a sync, both the updates to the
bits and the access must be performed with respect to all other
processors and mechanisms before the sync completes on processor A.

2. If processor B subsequently executes a tlbie that invalidates the TLB 
entry in processor A that was used to translate the access, and proces- 
sor B then executes a tlbsync that is broadcast, both the updates to the 
bits and the access must be performed with respect to all other proces- 
sors and mechanisms before the tlbsync completes on processor A. 

Updates to the Reference and Change bits may not be immediately vis- 
ible to the program after executing a load, store, or instruction fetch that 
sets them indirectly. 



4.10 Storage Protection 

The storage protection mechanism provides a means for selectively grant- 
ing read access, granting read/write access, and prohibiting access to 
areas of storage based on a number of control criteria. 

Since the protection mechanism operates as part of the address
translation mechanism, storage protection applies to translated accesses only.
Instruction storage access protection is active only when MSR[IR]=1. Data
storage access protection is active only when MSR[DR]=1.

Protection domains are defined only when the appropriate relocate bit
in the MSR (IR or DR) is 1. A protection domain is a page within an
ordinary segment, an area of storage defined by a BAT entry, or a direct- 
store segment. A protection boundary is a boundary between protection 
domains. 

For ordinary translated accesses to memory via the Page Table, the 
page protection mechanism described in the next section is active. Differ- 
ent mechanisms are used for Block Address Translation (BAT) accesses 
(see Section 4.10.2, "BAT Protection," on page 438) and for direct-store 
segments (see Section 4.6.2, "Direct-Store Segment Protection," on 
page 422). 

4.10.1 Page Protection 

The page protection mechanism provides protection at the granularity of 
a page (4 KB). It is controlled by the following inputs: 

■ MSR[PR], which distinguishes between supervisor state and problem
state

■ Ks and Kp, supervisor and problem key bits in the Segment Table
Entry or Segment Register

■ PP bits in the Page Table Entry

A reference made via the segmented address translation mechanism is
associated with a Segment Table Entry (STE) or Segment Register and
with a Page Table Entry (PTE) by the address translation mechanism. The
K bits, the PP bits, and the MSR[PR] bit are used as follows:

A Key value is developed according to the following formula:

Key ← (Kp & MSR[PR]) | (Ks & ¬MSR[PR])

Using the generated Key, the table in Figure 79 on page 438 is applied.
When a reference is not permitted because of the protection mechanism,
one of the following occurs:

■ A Data Storage interrupt is generated and bit 4 of the DSISR is set
to 1.

■ An Instruction Storage interrupt is generated and bit 36 {4} of SRR1 is
set to 1.

Key   PP   Page Type    Load Access Permitted   Store Access Permitted

 0    00   read/write   yes                     yes
 0    01   read/write   yes                     yes
 0    10   read/write   yes                     yes
 0    11   read only    yes                     no
 1    00   no access    no                      no
 1    01   read only    yes                     no
 1    10   read/write   yes                     yes
 1    11   read only    yes                     no

Key   Key selected by state of MSR[PR] bit
PP    PTE page protection bits

Figure 79. Protection Key processing
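
The Key formula and the table in Figure 79 combine into a two-step check. The sketch below is an illustrative Python rendering under the assumption that all inputs are single bits (and PP a two-bit value); the function names protection_key and access_permitted are hypothetical and do not appear in the architecture.

```python
def protection_key(ks: int, kp: int, msr_pr: int) -> int:
    """Key = (Kp & MSR[PR]) | (Ks & not MSR[PR]); all inputs are 0 or 1."""
    return (kp & msr_pr) | (ks & (msr_pr ^ 1))


# Page type for each (Key, PP) pair, transcribed from Figure 79.
_PAGE_TYPE = {
    (0, 0b00): "read/write",
    (0, 0b01): "read/write",
    (0, 0b10): "read/write",
    (0, 0b11): "read only",
    (1, 0b00): "no access",
    (1, 0b01): "read only",
    (1, 0b10): "read/write",
    (1, 0b11): "read only",
}


def access_permitted(key: int, pp: int, is_store: bool) -> bool:
    """Return True if Figure 79 permits the load or store."""
    page_type = _PAGE_TYPE[(key, pp)]
    if page_type == "no access":
        return False
    if page_type == "read only":
        return not is_store
    return True
```

For example, a problem-state access (MSR[PR]=1) to a segment with Kp=1 selects Key=1, so a page with PP=00 is inaccessible while a page with PP=10 may be read and written.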

4.10.2 BAT Protection 

The BAT protection mechanism operates on an entire BAT area, not on
individual pages. If an effective address is determined to be within a BAT
area that is valid for the access, the operations described above in Section
4.10.1, "Page Protection," on page 437 are performed, with these
exceptions:

■ For BATs, no Key value is defined; Figure 79 is used with an assumed
Key=1.

■ The PP bits from the lower BAT register are used, not bits from a Page 
Table Entry. 

4.11 Storage Control Instructions 
4.11.1 Cache Management Instructions 

This section contains the only privileged cache management instruction 
and additional specifications for the other cache management instructions 
described in Book II, Section 3.2, "Cache Management Instructions," on 
page 344.

If the effective address references a direct-store segment, the
instruction is treated as a no-op.

When data relocate is off, MSR[DR]=0, the Data Cache Block set to
Zero instruction establishes a block in the cache and may not verify that
the real address is valid. If a block is created for an invalid real address, a 
Machine Check may result when an attempt is made to write that block 
back to storage. The block could be written back as the result of the exe- 
cution of an instruction that causes a cache miss and the invalid address 
block is the target for replacement, or as the result of a Data Cache Block 
Store instruction. 



Data Cache Block Invalidate X-form 

dcbi RA,RB

Let the effective address (EA) be the sum (RA|0)+(RB).

The action taken is dependent on the storage mode associated with the 
target and on the state of the block. The list below describes the action to 
take if the block containing the byte addressed by EA is or is not in the 
cache. 

1. Coherence Not Required

Unmodified Block 

Invalidate the block in the local cache. 

Modified Block 

Invalidate the block in the local cache. (Discard the modified con- 
tents.) 

Absent Block 

No action is taken. 

2. Coherence Required 

Unmodified Block 

Invalidate copies of the block in the caches of all processors. 

Modified Block 

Invalidate copies of the block in the caches of all processors. (Dis- 
card the modified contents.) 

Absent Block 

If copies of the block are in the caches of any other processor, cause 
the copies to be invalidated. (Discard any modified contents.)

When data address translation is enabled, MSR[DR]=1, and the virtual
address has no translation, a Data Storage Interrupt occurs. See Section 
5.5.3, "Data Storage Interrupt," on page 460. 

The function of this instruction is independent of the Write Through 
Required/Not Required and Caching Inhibited/Allowed modes of the 
block containing the byte addressed by EA. 
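
The dcbi cases listed above reduce to one rule: with coherence not required only the local copy is invalidated, and with coherence required every copy is. A minimal Python model follows, assuming each processor's cache is represented as a dictionary from block address to state; that representation and the function name dcbi are illustrative assumptions, and address translation, protection, and privilege are not modeled.

```python
def dcbi(caches: list, cpu: int, block: int, coherence_required: bool) -> None:
    """Model Data Cache Block Invalidate on one cache block.

    caches holds one dict per processor, mapping a block address to
    "modified" or "unmodified"; a missing key means the block is absent.
    """
    targets = caches if coherence_required else [caches[cpu]]
    for cache in targets:
        # Modified contents are discarded, not written back.
        cache.pop(block, None)
```

Note that when the block is absent from the local cache and coherence is required, copies held by other processors are still invalidated.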

This instruction is treated as a store to the addressed byte with respect 
to address translation and storage protection, except that the Change bit 
need not be set, and if the Change bit is not set then the Reference bit 
need not be set. 

This instruction is privileged. 

Special Registers Altered 

None 



4.11.2 Segment Register Manipulation
Instructions



Programming Note 

For a discussion of 
software synchronization
requirements when 
altering Segment 
Registers, please refer to 
Chapter 7, 
"Synchronization 
Requirements for Special 
Registers and for 
Lookaside Buffers," on 
page 483. 



Move To Segment Register X-form 

mtsr SR,RS 



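+------+------+---+------+-----+-------+---+
|  31  |  RS  | / |  SR  | /// |  210  | / |
+------+------+---+------+-----+-------+---+
 0      6      11  12     16    21      31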



SEGREG(SR) ← (RS)

The contents of register RS are placed into Segment Register SR. 
This instruction is privileged. 

This instruction is defined only for 32-bit implementations. Using it on 
a 64-bit implementation will cause an Illegal Instruction type Program 
interrupt. 

Special Registers Altered 

None

Move To Segment Register Indirect X-form

mtsrin RS,RB
[Power mnemonic: mtsri]

SEGREG((RB)0:3) ← (RS)

The contents of register RS are copied to the Segment Register selected 
by bits 0:3 of register RB. 

This instruction is privileged. 

This instruction is defined only for 32-bit implementations. Using it on 
a 64-bit implementation will cause an Illegal Instruction type Program 
interrupt. 

Special Registers Altered 

None 



Programming Note 

The RA field is not 
defined for the mtsrin 
and mfsrin instructions 
in this architecture. 
However, mtsrin and 
mfsrin will perform the 
same function in 
PowerPC as do mtsri and 
mfsri in POWER if RA is 0 
in the POWER 
instructions. 



Move From Segment Register X-form 

mfsr RT,SR

RT ← SEGREG(SR)

The contents of Segment Register SR are placed into register RT. 
This instruction is privileged. 

This instruction is defined only for 32-bit implementations. Using it on 
a 64-bit implementation will cause an Illegal Instruction type Program 
interrupt. 

Special Registers Altered 

None 






Move From Segment Register Indirect X-form 

mfsrin RT,RB

RT ← SEGREG((RB)0:3)

The contents of the Segment Register selected by bits 0:3 of register 
RB are copied into register RT. 
This instruction is privileged. 

This instruction is defined only for 32-bit implementations. Using it on 
a 64-bit implementation will cause an Illegal Instruction type Program 
interrupt. 

Special Registers Altered 

None 
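
The four Segment Register moves differ only in how the register is selected: directly by the SR field, or indirectly by bits 0:3 of RB. The following Python sketch models that selection for a 32-bit implementation. It assumes 16 Segment Registers (the SR field is four bits wide) and the architecture's bit numbering, in which bit 0 is the most significant bit of the register; the class and method layout is hypothetical, and privilege checking is not modeled.

```python
class SegmentRegisters:
    """Toy model of the Segment Register file of a 32-bit implementation."""

    def __init__(self) -> None:
        self.sr = [0] * 16  # assumed: 16 registers, selected by a 4-bit field

    @staticmethod
    def _select(rb: int) -> int:
        # Bits 0:3 of a 32-bit register are its four most significant bits.
        return (rb >> 28) & 0xF

    def mtsr(self, sr: int, rs: int) -> None:
        if not 0 <= sr <= 15:
            raise ValueError("SR field must be in 0..15")
        self.sr[sr] = rs & 0xFFFFFFFF

    def mfsr(self, sr: int) -> int:
        if not 0 <= sr <= 15:
            raise ValueError("SR field must be in 0..15")
        return self.sr[sr]

    def mtsrin(self, rs: int, rb: int) -> None:
        self.sr[self._select(rb)] = rs & 0xFFFFFFFF

    def mfsrin(self, rb: int) -> int:
        return self.sr[self._select(rb)]
```

With this model, mtsrin with RB = 0xA0000000 writes the same register that mfsr reads with SR = 10; the low-order 28 bits of RB do not affect the selection.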

4.11.3 Lookaside Buffer Management 
Instructions (Optional) 

While the PowerPC Architecture describes logically separate instruction 
fetch and fixed-point (including effective address computation) execution 
units, the programming model is that there is one translation mechanism 
and, for 32-bit implementations, one set of Segment Registers. 

For performance reasons, most implementations will implement a Seg- 
ment Lookaside Buffer (SLB) (64-bit implementations) and a Translation 
Lookaside Buffer (TLB). These are caches of portions of the Segment 
Table and Page Table, respectively. As changes are made to the address 
translation tables, it is necessary to force the SLB and TLB into line with 
the updated tables. This is done by invalidating SLB and TLB entries, or 
occasionally by invalidating the entire SLB or TLB, and allowing the 
translation caching mechanism to refetch from the tables. 

Each PowerPC implementation that has an SLB must provide means 
for doing the following: 

■ invalidating an individual SLB entry 

■ invalidating the entire SLB 

Each PowerPC implementation that has a TLB must provide means 
for doing the following:

■ invalidating an individual TLB entry

■ invalidating the entire TLB 

An implementation may choose to provide one or more of the instruc- 
tions listed in this section in order to satisfy requirements in the preceding 
list. If an instruction is implemented that matches the semantics of an 
instruction described here, the implementation should be as specified 
here. Alternatively, an algorithm may be given that performs one of the 
functions listed above (a loop invalidating individual SLB entries may be 
used to invalidate the entire SLB, for example), or instructions with dif- 
ferent semantics may be implemented. Such algorithms or instructions 
must be described in Book IV, PowerPC Implementation Features.

It is permissible for an instruction described here to be implemented so 
that more is done than absolutely required. For example, an instruction 
whose semantics are to purge an SLB entry may be implemented so as to 
purge an entire congruence class or perhaps even the entire SLB. Such 
additional actions should be described in Book IV. 

If a 64-bit implementation does not implement an SLB, it treats the
corresponding instructions (slbie and slbia) either as no-ops or as illegal
instructions. Similarly, if any implementation does not implement a TLB,
it treats the corresponding instructions (tlbie, tlbia, and tlbsync) either as
no-ops or as illegal instructions.

SLB Invalidate Entry X-form 

slbie RB

EA ← (RB)

if SLB entry exists for EA then
    SLB entry ← invalid

Let the effective address (EA) be the contents of register RB. If the Seg- 
ment Lookaside Buffer (SLB) contains an entry corresponding to EA, that 
entry is made invalid (i.e., removed from the SLB). 

The SLB search is done regardless of the settings of MSR[IR] and
MSR[DR].

Block Address Translation for EA, if any, is ignored. 
This instruction is privileged. 

This instruction is optional in the PowerPC Architecture. 

This instruction is defined only for 64-bit implementations. Using it on
a 32-bit implementation will cause an Illegal Instruction type Program
interrupt.

Special Registers Altered

None

Programming Note

Because the presence,
absence, and exact
semantics of the various
Lookaside Buffer
Management
instructions are model-
dependent, it is
recommended that
system software
"encapsulate" uses of
such instructions into
subroutines to minimize
the impact of moving
from one
implementation to
another.

Programming Note

For a discussion of
software
synchronization
requirements when
invalidating SLB and TLB
entries, please refer to
Chapter 7,
"Synchronization
Requirements for Special
Registers and for
Lookaside Buffers," on
page 483.

Programming Note

It is not necessary that
the ASR point to a valid
Segment Table when
issuing slbie.



Programming Note 

It is not necessary that 
the ASR point to a valid 
Segment Table when 
issuing slbia.



SLB Invalidate All X-form

slbia

All SLB entries ← invalid

The entire SLB is made invalid (i.e., all entries are removed).
The SLB is invalidated regardless of the settings of MSR[IR] and
MSR[DR].

This instruction is privileged. 

This instruction is optional in the PowerPC Architecture.

This instruction is defined only for 64-bit implementations. Using it on 
a 32-bit implementation will cause an Illegal Instruction type Program 
interrupt. 

Special Registers Altered 

None 



TLB Invalidate Entry X-form 

tlbie RB 
[Power mnemonic: tlbi]

VPI ← (RB)36:51 {4:19}

Identify TLB entries corresponding to VPI 
Each such TLB entry <- invalid 

Let the effective address (EA) be the contents of register RB. If the 
Translation Lookaside Buffer (TLB) contains an entry corresponding to 
EA, that entry is made invalid (i.e., removed from the TLB). 

The TLB search is done regardless of the settings of MSR[IR] and
MSR[DR]. The search is done based on a portion of the Virtual Page Index,
including the least significant bits, without reference to the SLB, Segment
Table, or Segment Register. All entries matching the search criteria are
invalidated.

Block Address Translation for EA, if any, is ignored. 
This instruction is privileged. 

This instruction is optional in the PowerPC Architecture. 

See Section 4.12, "Table Update Synchronization Requirements," on 
page 446 for a description of other requirements associated with the use 
of this instruction. 



Programming Note 

Nothing is guaranteed 
about instruction 
fetching in other 
processors if tlbie deletes
the TLB entry for the 
page in which some 
other processor is 
currently executing. 



Special Registers Altered 

None 



TLB Invalidate All X-form

tlbia

All TLB entries ← invalid

The entire TLB is invalidated (i.e., all entries are removed).
The TLB is invalidated regardless of the settings of MSR[IR] and
MSR[DR].

This instruction is privileged.

This instruction is optional in the PowerPC Architecture.

Special Registers Altered

None



Programming Notes 

It is not necessary that 
the ASR point to a valid 
Segment Table or that 
SDR1 point to a valid
Page Table when issuing
tlbia.

Nothing is guaranteed
about instruction
fetching in other
processors if tlbia deletes
the TLB entry for the 
page in which some 
other processor is 
currently executing. 



TLB Synchronize X-form

tlbsync

The tlbsync instruction does not complete until all previous tlbie and
tlbia instructions executed by the processor executing this instruction
have been received and completed by all other processors. 

This instruction is privileged. 

This instruction is optional in the PowerPC Architecture, but it must 
be implemented if any of the following are true: 






■ A TLB invalidation instruction that broadcasts is implemented. 

■ The eciwx or ecowx instructions are implemented. 

See Section 4.12, "Table Update Synchronization Requirements," on 
page 446 for a description of other requirements associated with the use 
of this instruction. 

Special Registers Altered 

None 

4.12 Table Update Synchronization 
Requirements 

This section describes the steps that software must take when updating 
the tables involved in address translation. Updates to these tables include: 

■ Adding a new Page Table Entry (PTE) 

■ Modifying an existing PTE, including the special case of modifying the 
PTE's Reference bit 

■ Deleting a PTE 

■ Adding a new Segment Table Entry (STE) 

■ Modifying an existing STE 

■ Deleting an STE 

In a multiprocessor system it is critical that these rules be followed to 
ensure that all processors see a consistent set of tables. Even in a unipro- 
cessor system certain rules must be followed, notably those regarding 
Reference and Change bit updates, because software changes must be 
synchronized with automatic updates by the hardware. 

A sync instruction ensures that all prior tlbie instructions executed by 
the processor executing the sync instruction have completed on that pro- 
cessor. 

To ensure that a tlbie instruction executed by one processor has com- 
pleted on all other processors, the sequence tlbie followed by sync is not 
sufficient. This sequence must be followed by a tlbsync instruction and 
then a sync instruction on the processor that executed the tlbie to ensure 
that:

1. the prior tlbie instructions have completed on other processors, and
2. the tlbsync has completed on the processor executing this sequence.

When tlbie is executed on one processor, software must ensure that the 
following sequence of instructions is executed on that processor before a 
tlbie is executed on a second processor: 

1. sync

2. tlbsync 

3. sync 

Other instructions may be interleaved with this sequence of instruc- 
tions, but these instructions must appear in the order shown. 
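
The ordering rule above (sync, tlbsync, sync after a tlbie, with other instructions allowed in between) is a subsequence check. The Python sketch below illustrates it on a list of mnemonics executed by one processor; the function name and the trace representation are hypothetical, and the check says nothing about what other processors have done.

```python
def tlbie_sequence_complete(trace: list) -> bool:
    """Check that sync, tlbsync, sync follow the last tlbie, in order.

    trace is a list of instruction mnemonics executed by one processor.
    Other instructions may be interleaved with the required sequence.
    """
    if "tlbie" not in trace:
        return True  # nothing to synchronize
    last_tlbie = len(trace) - 1 - trace[::-1].index("tlbie")
    required = ["sync", "tlbsync", "sync"]
    matched = 0
    for op in trace[last_tlbie + 1:]:
        if matched < len(required) and op == required[matched]:
            matched += 1
    return matched == len(required)
```

A trace such as tlbie, sync, lwz, tlbsync, sync passes the check, whereas tlbie, tlbsync, sync does not, because the first sync is missing.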

The code sequences shown in Sections 4.12.1 and 4.12.2 assume that a 
context synchronizing operation has occurred before the sequence is exe- 
cuted (e.g., that the sequence is executed within the Data Storage inter- 
rupt handler). 

Page Table Entries and Segment Table Entries must not be changed in 
a manner that causes an implicit branch. 

4.12.1 Page Table Updates 

HTAB entries must be locked on multiprocessors. Access to HTAB 
entries must be appropriately synchronized by software locking of (i.e., 
guaranteeing exclusive access to) entries or groups of entries if more than 
one processor can modify the table at once. 

On uniprocessors, HTAB entries need not be locked. To adapt the 
examples given below for the uniprocessor case, simply delete the
"lock()" and "unlock()" lines. The sync instructions shown are still
required even on uniprocessors. 

TLBs are noncoherent caches of the HTAB. TLB entries must be 
flushed explicitly with one of the TLB Invalidate instructions. The sync 
instruction waits until all prior TLB invalidates by this processor are 
complete. This may cost a sync per HTAB entry update. 

Unsynchronized lookups in the HTAB continue even while it is being 
modified. Any processor, including the processor modifying the HTAB, 
may look in the HTAB at any time in an attempt to reload a TLB entry. 
An inconsistent HTAB entry must never accidentally become visible, thus 
there must be synchronization between modifications to the Valid bit and 
any other modifications. This costs as many as two syncs per HTAB entry 
update.

Processors write Reference and Change bits with unsynchronized
atomic byte stores. This requires that the V, R, and C bits be in distinct 
bytes. It also requires extreme care to ensure that no store overwrites one 
of these bytes accidentally. 

In the examples below,

■ "lock()" and "unlock()" refer to software locks for exclusive access to
the table entry in question,

■ sync refers to the sync instruction, 

■ tlbsync refers to the tlbsync instruction, and 

■ tlbie refers to the tlbie instruction. 

Adding a Page Table Entry 

This is the simplest Page Table case. It requires no synchronization with 
the hardware, just a lock on the PTE in a multiprocessor system. We fill 
in the entries in the PTE except for the Valid bit, issue a sync to ensure 
that the updates have all made it to storage, and turn on the Valid bit. 

lock(PTE)
PTE_VSID,H,API <- new values
PTE_RPN,R,C,WIMG,PP <- new values
sync
PTE_V <- 1
unlock(PTE)

Modifying a Page Table Entry 
General case 

In this case a currently valid PTE must be changed. To do this we must 
lock the PTE, mark it invalid, flush it from the TLB, update the informa- 
tion in the PTE, mark it valid again, and unlock, using sync at appropri- 
ate times to wait for modifications to complete. 

lock(PTE)
PTE_V <- 0
sync
tlbie(PTE)
sync
tlbsync
sync
PTE_VSID,H,API <- new values
PTE_RPN,R,C,WIMG,PP <- new values
sync
PTE_V <- 1
unlock(PTE)
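
The general-case sequence can be traced with a toy model. This is a sketch under stated assumptions: `TraceMachine`, its dictionary-based PTE, and the step names are hypothetical, and the model only records ordering. It does not model storage, TLBs, or other processors.

```python
class TraceMachine:
    """Records the ordered steps of a table update (hypothetical model)."""

    def __init__(self):
        self.pte = {"V": 1, "VSID": 0x10, "RPN": 0x200}
        self.steps = []

    def store(self, field, value):
        self.pte[field] = value
        self.steps.append(f"store {field}")

    def op(self, name):
        self.steps.append(name)


def modify_pte(m, new_fields):
    """Apply the general-case update order to the model's PTE."""
    m.op("lock")
    m.store("V", 0)            # mark the entry invalid first
    m.op("sync")
    m.op("tlbie")              # flush the old translation
    m.op("sync")
    m.op("tlbsync")
    m.op("sync")
    for field, value in new_fields.items():
        # The entry is invalid here, so a table search cannot pick up
        # a half-updated entry.
        assert m.pte["V"] == 0
        m.store(field, value)
    m.op("sync")
    m.store("V", 1)            # the Valid bit is turned on last
    m.op("unlock")
```

The point of the model is that every field store falls between the store that clears V and the store that sets it again.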

Resetting the Reference bit 

In the case where the PTE is modified only to set the Reference bit to 0, a 
much simpler algorithm suffices because the Reference bit need not be 
maintained exactly. 

lock(PTE)
oldR <- PTE_R
if oldR = 1 then
    PTE_R <- 0
    tlbie(PTE)
unlock(PTE)

Since only the R and C bits are modified by hardware, and since R and 
C are in different bytes, the R bit can be set to 0 by reading the current 
contents of the byte in the PTE containing R (bits 48:55 of the second 
doubleword on 64-bit implementations, bits 16:23 of the second word on 
32-bit implementations), ANDing the value with 0xFE, and storing the
byte back into the PTE. 
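
The byte-wise update can be sketched as follows for a 32-bit implementation. The function name and the use of a plain integer for the second PTE word are assumptions for illustration. The sketch shows only the arithmetic and does not model the byte store itself.

```python
R_BYTE_SHIFT = 8      # bits 16:23 of a 32-bit word, bit 0 most significant
R_CLEAR_MASK = 0xFE   # R is the low-order bit of that byte


def clear_reference_bit(pte_word1):
    """Return the second PTE word with R cleared, changing one byte only."""
    r_byte = (pte_word1 >> R_BYTE_SHIFT) & 0xFF   # read the byte holding R
    r_byte &= R_CLEAR_MASK                        # AND with 0xFE
    others = pte_word1 & ~(0xFF << R_BYTE_SHIFT) & 0xFFFFFFFF
    return others | (r_byte << R_BYTE_SHIFT)      # store the byte back
```

Because the C bit lies in the next byte, a concurrent hardware update of C is not disturbed by this store.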

Modifying the virtual address

If the virtual address is being changed to a different address within the 
same TLB hash class, it suffices to: 

lock(PTE)
val <- PTE_VSID,API,H,V
insert new VSID into val
PTE_VSID,API,H,V <- val
sync
tlbie(PTE)
sync
tlbsync
sync
unlock(PTE)

Here we take advantage of the fact that the store into the first
doubleword (word, on 32-bit systems) of the PTE is performed atomically.

Deleting a Page Table Entry

Here we just lock the entry, mark it invalid, wait for the change to com- 
plete, and unlock. 

lock(PTE)
PTE_V <- 0
sync
tlbie(PTE)
sync
tlbsync
sync
unlock(PTE)

4.12.2 Segment Table Updates 

These updates are similar to Page Table updates, but without the compli- 
cation of hardware updates to Reference and Change bits. 

STAB entries must be locked on multiprocessors. Access to STAB 
entries must be appropriately synchronized by software locking of (i.e., 
guaranteeing exclusive access to) entries or groups of entries if more than 
one processor can modify the table at once. 

On uniprocessors, STAB entries need not be locked. To adapt the 
examples given below for the uniprocessor case, simply delete the
"lock()" and "unlock()" lines. The sync instructions shown are still
required even on uniprocessors. 

SLBs are noncoherent caches of the STAB. SLB entries must be flushed 
explicitly with one of the SLB Invalidate instructions. The sync instruc- 
tion waits until all prior SLB invalidates by this processor are complete. 
This may cost a sync per STAB entry update. 

Unsynchronized lookups in the STAB continue even while it is being 
modified. Any processor, including the processor modifying the STAB, 
may look in the STAB at any time in an attempt to reload an SLB entry. 
An inconsistent STAB entry must never accidentally become visible, thus 
there must be synchronization between modifications to the Valid bit and 
any other modifications. This costs as many as two syncs per STAB entry 
update. 

In the examples below,

■ "lock()" and "unlock()" refer to software locks for exclusive access to
the table entry in question,

■ sync refers to the sync instruction, and

■ slbie refers to the slbie instruction.


Adding a Segment Table Entry 

We fill in the entries in the STE except for the Valid bit, issue a sync to
ensure that the updates have all made it to storage, and turn on the Valid 
bit. 

lock(STE)
STE_ESID,T,Ks,Kp <- new values
if T = 0
    then STE_VSID <- new value
    else STE_IO <- new value
sync
STE_V <- 1
unlock(STE)



Modifying a Segment Table Entry 

In this case a currently valid STE must be changed. To do this we must
lock the STE, mark it invalid, flush it from the SLB, update the informa- 
tion in the STE, mark it valid again, and unlock, using sync at appropri- 
ate times to wait for modifications to complete. 

lock(STE)
STE_V <- 0
sync
slbie(STE)
sync
STE_ESID,T,Ks,Kp <- new values
if T = 0
    then STE_VSID <- new value
    else STE_IO <- new value
sync
STE_V <- 1
unlock(STE)



Deleting a Segment Table Entry 

Here we just lock the entry, mark it invalid, wait for the change to com- 
plete, and unlock. 

lock(STE)
STE_V <- 0
sync
slbie(STE)
sync
unlock(STE)

4.12.3 Segment Register Updates

On an implementation that provides Segment Registers rather than a Seg- 
ment Table, there is no table to be locked but there are certain synchroni- 
zation requirements that must be satisfied when using the Move To 
Segment Register instructions. See Chapter 7, "Synchronization
Requirements for Special Registers and for Lookaside Buffers," on page 483.

5.1 Overview

The PowerPC architecture provides an interrupt mechanism to allow the 
processor to change state as a result of external signals, errors, or unusual 
conditions arising in the execution of instructions. 

System Reset and Machine Check interrupts are not ordered. All 
other interrupts are ordered such that only one interrupt is reported, and 
when it is processed (taken) no program state is lost. Since save/restore 
registers SRR0 and SRR1 are serially reusable resources used by most
interrupts, program state may be lost when an unordered interrupt is 
taken. 

5.2 Interrupt Synchronization 

When an interrupt occurs, SRR0 is set to point to an instruction such that
all preceding instructions have completed execution, no subsequent
instruction has begun execution, and the instruction addressed by SRR0
may or may not have completed execution, depending on the interrupt 
type. 

With the exception of System Reset and Machine Check interrupts, all 
interrupts are context synchronizing as defined in Section 1.7.1, "Context 
Synchronization," on page 371. System Reset and Machine Check
interrupts are context synchronizing if they are recoverable (i.e., if
bit 62 {30} of SRR1 is set to 1 by the interrupt). If a System Reset or
Machine Check interrupt is not recoverable (i.e., if bit 62 {30} of SRR1
is set to 0 by the interrupt), it acts like a context synchronizing
operation with respect to subsequent instructions. That is, a
non-recoverable System Reset or Machine Check interrupt need not satisfy
items 1 through 3 of Section 1.7.1, but does satisfy items 4 and 5.

5.3 Interrupt Classes 

Interrupts are classified by whether they are directly caused by the execu- 
tion of an instruction or are caused by some other system exception. 
Those that are "system-caused" are: 

■ System Reset 

■ Machine Check 

■ External 

■ Decrementer 

External and Decrementer are maskable interrupts. While MSR_EE=0,
the interrupt mechanism ignores the exceptions that generate these
interrupts. Therefore, software may delay the generation of these
interrupts by setting MSR_EE=0 or by failing to set MSR_EE=1 after
processing an interrupt. When any interrupt is taken, MSR_EE is set to 0
by the interrupt mechanism, delaying the recognition of any further
exceptions causing these interrupts.

System Reset and Machine Check exceptions are not maskable. These 
exceptions will be recognized regardless of the setting of the MSR. 

"Instruction-caused" interrupts are further divided into two classes, 
precise and imprecise.

5.3.1 Precise Interrupt

Except for the Imprecise Mode Floating-Point Enabled Exception inter- 
rupt, all instruction-caused interrupts are precise. When the fetching or 
execution of an instruction causes a precise interrupt, the following con- 
ditions exist at the interrupt point: 

1. SRR0 addresses either the instruction causing the exception or the
immediately following instruction. Which instruction is addressed can 
be determined from the interrupt type and status bits. 

2. An interrupt is generated such that all instructions preceding the
instruction causing the exception appear to have completed with
respect to the executing processor. However, some storage accesses
generated by these preceding instructions may not have been
performed with respect to all other processors and mechanisms.

3. The instruction causing the exception may appear not to have begun
execution (except for causing the exception), may have partially
completed, or may have completed, depending on the interrupt type.

4. Architecturally, no subsequent instruction has begun execution.

5.3.2 Imprecise Interrupt 

This architecture defines one imprecise interrupt, the Imprecise Mode 
Floating-Point Enabled Exception interrupt. 

When the execution of an instruction causes an imprecise interrupt, 
the following conditions exist at the interrupt point:

1. SRR0 addresses either the instruction causing the exception or some
instruction following the instruction causing the exception that
generated the interrupt.

2. An interrupt is generated such that all instructions preceding the
instruction addressed by SRR0 appear to have completed with respect
to the executing processor.

3. If the imprecise interrupt is forced by the context synchronizing
mechanism, due to an instruction that causes another interrupt (e.g.,
Alignment, Data Storage), then SRR0 addresses the interrupt-forcing
instruction, and the interrupt-forcing instruction may have been
partially executed (see Section 5.6, "Partially Executed Instructions,"
on page 472).

4. If the imprecise interrupt is forced by the execution synchronizing
mechanism, due to executing an execution synchronizing instruction
other than sync or isync, then SRR0 addresses the interrupt-forcing
instruction, and the interrupt-forcing instruction appears not to have
begun execution (except for its forcing the imprecise interrupt). If the
imprecise interrupt is forced by a sync or isync instruction, then SRR0
may address either the sync or isync instruction or the following
instruction.

5. If the imprecise interrupt is not forced by either the context
synchronizing mechanism or the execution synchronizing mechanism, then
the instruction addressed by SRR0 appears not to have begun execution, if
it is not the excepting instruction.

6. No instruction following the instruction addressed by SRR0 appears
to have begun execution.

All Floating-Point Enabled Exception interrupts are maskable using 
the MSR bits FE0 and FE1. Although these interrupts are maskable, they
differ significantly from the other maskable interrupts in that the masking 
of these interrupts is usually controlled by the application program, 
whereas the masking of External and Decrementer interrupts is con- 
trolled by the operating system. 



5.4 Interrupt Processing 



Programming Note

In some implementations, every instruction fetch when MSR_IR=1, and
every instruction execution requiring address translation when
MSR_DR=1, may have the side effect of modifying SRR0 and SRR1. For
further details, see the Book IV, PowerPC Implementation Features
document for the implementation.



Associated with each kind of interrupt is an interrupt vector, which con- 
tains the initial sequence of instructions that is executed when the corre- 
sponding interrupt occurs. 

Interrupt processing consists of saving a small part of the processor's 
state in certain registers, identifying the cause of the interrupt in another 
register, and continuing execution at the corresponding interrupt vector 
location. When an exception exists that will cause an interrupt to be gen- 
erated and it has been determined that the interrupt can be taken, the fol- 
lowing actions are performed: 

1. SRR0 is loaded with an instruction address that depends on the type
of interrupt; see the specific interrupt description for details.

2. Bits 33:36 and 42:47 {1:4 and 10:15} of SRR1 are loaded with
information specific to the interrupt type.

3. Bits 0:32, 37:41, and 48:63 {0, 5:9, and 16:31} of SRR1 are loaded
with a copy of the corresponding bits of the MSR, except for the
Machine Check interrupt, for which these bits are set to
implementation-dependent values.

4. The MSR is set as described in Figure 80 on page 458. The new values
take effect beginning with the first instruction following the interrupt.
MSR bits of particular interest are:

■ MSR_IR and MSR_DR are set to 0 for all interrupt types. Thus
relocate is turned off for both instruction fetch and data access
beginning with the first instruction following the acceptance of the
interrupt. See Chapter 4, "Storage Control," on page 391.

■ MSR_SF is set to 1 in 64-bit implementations and execution after the
interrupt begins in 64-bit mode. This bit does not exist in 32-bit
implementations.

5. Instruction fetch and execution resume, using the new MSR value, at
a location specific to the interrupt type. The location is determined by
adding the interrupt vector's offset (see Figure 81 on page 459) to the
base address determined by MSR_IP (see Interrupt Prefix on page 377).
For a Machine Check that occurs when MSR_ME=0, the Checkstop
state is entered (the machine stops executing instructions). See Section
5.5.2, "Machine Check Interrupt," on page 459.
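
Steps 2 and 3 partition SRR1 into bits copied from the MSR and bits carrying interrupt-specific information. The following sketch shows that partition for the 64-bit bit numbering, in which bit 0 is the most significant bit. The names `ibm_mask` and `build_srr1` are illustrative assumptions, and the Machine Check case, where the copied bits are implementation-dependent, is not modeled.

```python
def ibm_mask(first, last, width=64):
    """Mask covering bits first:last, where bit 0 is the most significant."""
    nbits = last - first + 1
    return ((1 << nbits) - 1) << (width - 1 - last)


# Bit ranges taken from steps 2 and 3 above (64-bit numbering).
COPIED_FROM_MSR = ibm_mask(0, 32) | ibm_mask(37, 41) | ibm_mask(48, 63)
INTERRUPT_INFO = ibm_mask(33, 36) | ibm_mask(42, 47)


def build_srr1(msr, info=0):
    """Combine MSR bits with interrupt-specific bits into an SRR1 image."""
    return (msr & COPIED_FROM_MSR) | (info & INTERRUPT_INFO)
```

The two masks are disjoint and together cover all 64 bits, so every SRR1 bit has exactly one source.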

Interrupts do not clear reservations obtained with lwarx or ldarx.
The operating system should do so at appropriate points, such as at pro- 
cess switch. 



5.5 Interrupt Definitions 

Figure 80 on page 458 shows all the types of interrupts and the values 
assigned to the MSR for each. Figure 81 on page 459 shows the offset of 
the interrupt vector for each interrupt type. 



5.5.1 System Reset Interrupt 

System Reset begins with a System Reset interrupt. 

If the System Reset exception caused the processor state to be
corrupted such that the contents of SRR0 or SRR1 are not valid or other
processor resources are corrupt and would preclude reliable resumption
of program execution, then the processor sets SRR1 bit 62 {30} (where
MSR_RI is normally placed) to 0, to indicate to the interrupt handler that
the interrupt is not recoverable.
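
A handler's recoverability test reduces to examining one bit of the saved SRR1 value. The sketch below assumes the bit numbering used in this book (bit 0 is the most significant bit); the function name and the `width` parameter are illustrative only.

```python
def srr1_recoverable(srr1, width=64):
    """Return True if SRR1 bit 62 {30} is 1, i.e. the interrupt is recoverable."""
    ri_bit = 62 if width == 64 else 30
    # With bit 0 as the most significant bit, bit n sits at shift
    # (width - 1 - n) counted from the least significant end.
    return bool((srr1 >> (width - 1 - ri_bit)) & 1)
```

In both the 64-bit and the 32-bit numbering the bit tested is the second least significant bit of the register.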

The following registers are set: 

SRR0 Set to the effective address of the instruction that the processor
     would have attempted to execute next if no interrupt conditions
     were present.

SRR1
     33:36 {1:4}    Set to 0.
     42:47 {10:15}  Set to 0.
     62 {30}        Loaded from bit 62 {30} of the MSR if the processor is
                    in a recoverable state, otherwise set to 0.
     Others         Loaded from the MSR.

MSR  See Figure 80 on page 458.



Programming Note

In general, at process switch, due to possible process interlocks and
possible data availability requirements, the operating system needs
to consider executing the following:

■ stwcx., to clear the reservation if one is outstanding, to ensure
that a lwarx or ldarx in the "old" process is not paired with a stwcx.
or stdcx. in the "new" process.

■ sync, to ensure that all storage operations of an interrupted process
are complete with respect to other processors before that process
begins executing on another processor.

■ isync or rfi, to ensure that the instructions in the "new" process
execute in the "new" context.

Programming Note

In order to handle Machine Check and System Reset interrupts
correctly, the operating system should manage MSR_RI as follows.

■ In the Machine Check and System Reset interrupt handlers, interpret
SRR1 bit 62 {30} (where MSR_RI is placed) as:

— 0: interrupt is not recoverable
— 1: interrupt is recoverable

■ In each interrupt handler, when enough state has been saved that a
Machine Check or System Reset interrupt can be recovered from, set
MSR_RI to 1.

■ In each interrupt handler, do the following (in order) just before
returning:

1. Set MSR_RI to 0.

2. Set SRR0 and SRR1 to the values to be used by rfi. The new value of
SRR1 should have bit 62 {30} set to 1 (which will happen naturally if
SRR1 is restored to the value saved there by the interrupt, because
the interrupt handler will not be executing this sequence unless the
interrupt is recoverable).

3. Execute rfi.

MSR_RI can be managed similarly to handle interrupts other than
Machine Check and System Reset that occur within interrupt handlers.

This Note describes only the management of MSR_RI. It is not intended
to be a full description of the requirements for an interrupt handler.



                         MSR bit
Interrupt Type           IP    ILE   LE    ME    SF
System Reset             —     —     (1)   —     1
Machine Check            —     —     (1)   0     1
Data Storage             —     —     (1)   —     1
Instruction Storage      —     —     (1)   —     1
External                 —     —     (1)   —     1
Alignment                —     —     (1)   —     1
Program                  —     —     (1)   —     1
FP Unavailable           —     —     (1)   —     1
Decrementer              —     —     (1)   —     1
System Call              —     —     (1)   —     1
Trace                    —     —     (1)   —     1
Floating-Point Assist    —     —     (1)   —     1

0    bit is set to 0
1    bit is set to 1
—    bit is not altered
(1)  bit is copied from ILE

Defined bits not shown above (BE, DR, EE, FE0, FE1, FP, IR, POW, PR, RI,
and SE) are set to 0.

Reserved bits are set as if written as 0.

Figure 80. MSR setting due to interrupt

Execution resumes at offset 0x00100 from the base real address
indicated by MSR_IP.

Each implementation provides a means for software to distinguish
power-on Reset from other types of System Reset, and describes it in the
Book IV, PowerPC Implementation Features document for the
implementation.

Offset (hex)   Interrupt Type
00000          Reserved
00100          System Reset
00200          Machine Check
00300          Data Storage
00400          Instruction Storage
00500          External
00600          Alignment
00700          Program
00800          Floating-Point Unavailable
00900          Decrementer
00A00          Reserved
00B00          Reserved
00C00          System Call
00D00          Trace
00E00          Floating-Point Assist
00E10          Reserved
00FFF          Reserved
01000          Reserved, implementation-specific
02FFF          (end of interrupt vector locations)

Figure 81. Offset of interrupt vector by interrupt type

Programming Note

Use of any of the locations shown as reserved in Figure 81 risks
incompatibility with future implementations.
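
The vector location for an interrupt is the base address plus the offset from Figure 81. The sketch below takes the base as a parameter, because the rule that derives it from MSR_IP is described elsewhere; the dictionary and function names are assumptions made for illustration.

```python
# Offsets of the defined (non-reserved) interrupt vectors from Figure 81.
VECTOR_OFFSETS = {
    "System Reset": 0x00100,
    "Machine Check": 0x00200,
    "Data Storage": 0x00300,
    "Instruction Storage": 0x00400,
    "External": 0x00500,
    "Alignment": 0x00600,
    "Program": 0x00700,
    "Floating-Point Unavailable": 0x00800,
    "Decrementer": 0x00900,
    "System Call": 0x00C00,
    "Trace": 0x00D00,
    "Floating-Point Assist": 0x00E00,
}


def vector_address(base, interrupt_type):
    """Return the real address at which execution resumes."""
    return base + VECTOR_OFFSETS[interrupt_type]
```

Reserved offsets are deliberately left out of the table, so asking for one raises a KeyError instead of returning an address.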

5.5.2 Machine Check Interrupt 

Machine Check interrupts are enabled when MSR_ME=1. If MSR_ME=0
and a Machine Check occurs, the processor enters the Checkstop state.

Programming Note

On some implementations a Machine Check interrupt may occur due to
referencing an invalid (nonexistent) real address, either directly
(with MSR_DR=0) or through an invalid translation. On such a system,
execution of Data Cache Block set to Zero can cause a delayed
Machine Check interrupt by introducing a block into the data cache that
is associated with an invalid real address. A Machine Check interrupt
could eventually occur when and if a subsequent attempt is made to store
that block to main storage.



Disabled Machine Check (Checkstop State) 

When a processor is in Checkstop state, instruction processing is sus- 
pended and generally cannot be restarted without resetting the processor. 
Some implementations may preserve some or all of the internal state of 
the processor when entering Checkstop state, so that the state can be ana- 
lyzed as an aid in problem determination. 

Enabled Machine Check 

If the Machine Check exception caused the processor state to be
corrupted such that the contents of SRR0 or SRR1 are not valid or other
processor resources are corrupt and would preclude reliable resumption
of program execution, then the processor sets SRR1 bit 62 {30} (where
MSR_RI is normally placed) to 0, to indicate to the interrupt handler that
the interrupt is not recoverable.

In some systems, the operating system may attempt to identify and log
the cause of the Machine Check. If the exception that caused the
Machine Check does not preclude continued execution (i.e., if SRR1 bit
62 {30} is set to 1 for the interrupt handler), the processor must be able to
continue execution at the Machine Check interrupt vector address.

The following registers are set: 

SRR0 Set on a "best effort" basis to the effective address of some
     instruction that was executing or was about to be executed when the
     Machine Check exception occurred. For further details see the Book
     IV, PowerPC Implementation Features document for the
     implementation.

SRR1
     62 {30}  Loaded from bit 62 {30} of the MSR if the processor is in a
              recoverable state, otherwise set to 0.
     Others   See the Book IV, PowerPC Implementation Features document
              for the implementation.

MSR  See Figure 80 on page 458.

Execution resumes at offset 0x00200 from the base real address
indicated by MSR_IP.



5.5.3 Data Storage Interrupt 

A Data Storage interrupt occurs when no higher priority exception exists
and a data storage access cannot be performed for any of the following
reasons:

■ The instruction results in a Direct-Store Error exception.

■ The effective address of a Load, Store, icbi, dcbz, dcbst, dcbf, dcbi,
eciwx, or ecowx instruction cannot be translated.

■ The instruction is not supported for the type of storage addressed.

— lwarx, ldarx, stwcx., or stdcx. to a location that is Write Through
Required (if the interrupt does not occur then the instruction
executes correctly: see Book II, Section 1.8.2, "Atomic Update
Primitives," on page 336).

— lwarx, ldarx, stwcx., stdcx., eciwx, or ecowx to a direct-store
segment (if the interrupt does not occur then the results are boundedly
undefined: see Section 4.6.3, "Instructions Not Supported for
T=1," on page 422).

■ The access violates storage protection.

■ Execution of an eciwx or ecowx instruction is disallowed because
EAR_E=0.

Such accesses can be generated by load/store type instructions
(discussed in Book I, PowerPC User Instruction Set Architecture), certain
storage control instructions, certain cache control instructions (discussed
in Book II, PowerPC Virtual Environment Architecture), and the eciwx
and ecowx instructions.

If a stwcx. or stdcx. has an effective address for which a normal Store
would cause a Data Storage interrupt, but the processor does not have
the reservation from lwarx or ldarx, then it is implementation-dependent
whether a Data Storage interrupt occurs.

If a Move Assist instruction has a length of zero (in the XER), a Data 
Storage interrupt does not occur, regardless of the effective address. 

The following registers are set: 

SRR0 Set to the effective address of the instruction that caused the
     interrupt.

SRR1
     33:36 {1:4}    Set to 0.
     42:47 {10:15}  Set to 0.
     Others         Loaded from the MSR.

MSR  See Figure 80 on page 458.

DSISR
     0      Set to 1 if a load or store instruction results in a
            Direct-Store Error exception, otherwise 0.
     1      Set to 1 if the translation of an attempted access is not
            found in the hashed primary HTEG, or in the rehashed
            secondary HTEG, or in the range of a DBAT register;
            otherwise 0.
     2:3    Set to 0.
     4      Set to 1 if a storage access is not permitted by the page or
            DBAT protection mechanism described on page 436, otherwise 0.
     5      Set to 1 if the access was due to an eciwx, ecowx, lwarx,
            ldarx, stwcx., or stdcx. that addresses a direct-store
            segment (T=1 in Segment Register or Segment Table Entry), or
            if the access was due to a lwarx, ldarx, stwcx., or stdcx.
            that addresses Write Through storage; set to 0 otherwise.
     6      Set to 1 for a store operation and to 0 for a load operation.
     7:8    Set to 0.
     9      Reserved for DABR (see the Book IV, PowerPC Implementation
            Features document for the implementation).
     10     Set to 1 if the Segment Table search fails to find a
            translation for the effective address, otherwise set to 0.
     11     Set to 1 if execution of an eciwx or ecowx instruction was
            attempted when EAR_E=0, otherwise set to 0.
     12:31  Set to 0.

DAR  Set to the effective address of a storage element as described in the
     following list.

     ■ A byte in the first word accessed in the page that caused the
     Data Storage interrupt, for a byte, halfword, or word access to
     an ordinary segment.

     ■ A byte in the first doubleword accessed in the page that caused
     the Data Storage interrupt, for a doubleword access to an
     ordinary segment.

     ■ A byte in the first word accessed in the BAT area that caused
     the Data Storage interrupt, for a byte, halfword, or word
     access to a BAT area.

     ■ A byte in the first doubleword accessed in the BAT area that
     caused the Data Storage interrupt, for a doubleword access to
     a BAT area.

     ■ A byte in the block that caused the Data Storage interrupt, for
     icbi, dcbz, dcbst, dcbf, or dcbi.

     ■ Any effective address in the range of storage being addressed,
     for a Direct-Store Error exception.

     ■ The effective address computed by the instruction, for
     attempted execution of eciwx or ecowx when EAR_E=0.

     If the interrupt occurs in 32-bit mode on a 64-bit implementation,
     the high-order 32 bits of the DAR are set to 0.

Execution resumes at offset 0x00300 from the base real address
indicated by MSR_IP.
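
A Data Storage interrupt handler typically begins by decoding the DSISR. The following sketch decodes the bits defined above, using this book's numbering in which bit 0 is the most significant bit of the 32-bit register. The function names and the wording of the cause strings are assumptions made for illustration.

```python
# DSISR bit number -> cause, paraphrased from the definitions above.
DSISR_CAUSES = {
    0: "Direct-Store Error exception",
    1: "translation not found in the HTEGs or in a DBAT range",
    4: "access not permitted by page or DBAT protection",
    5: "instruction not supported for the type of storage addressed",
    10: "Segment Table search failed",
    11: "eciwx or ecowx attempted with EAR_E = 0",
}


def bit(value, n, width=32):
    """Return bit n of value, where bit 0 is the most significant bit."""
    return (value >> (width - 1 - n)) & 1


def decode_dsisr(dsisr):
    """Return (access kind, list of causes) for a DSISR value."""
    causes = [text for n, text in DSISR_CAUSES.items() if bit(dsisr, n)]
    access = "store" if bit(dsisr, 6) else "load"
    return access, causes
```

For example, a store to an untranslated address sets bits 1 and 6, which in this numbering is the value 0x42000000.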

5.5.4 Instruction Storage Interrupt 

An Instruction Storage interrupt occurs when no higher priority excep- 
tion exists and an attempt to fetch the next instruction to be executed 
cannot be performed for any of the following reasons: 

■ The effective address cannot be translated. 

■ The fetch access is to a direct-store segment. 

■ The fetch access violates storage protection. 

Such accesses can only be generated by instruction fetches.

The following registers are set:

SRR0 Set to the effective address of the instruction that the processor
     would have attempted to execute next if no interrupt conditions
     were present (if the interrupt occurs on attempting to fetch a
     branch target, SRR0 is set to the branch target address).

SRR1
     33 {1}         Set to 1 if the translation of an attempted access is
                    not found in the hashed primary HTEG, or in the
                    rehashed secondary HTEG, or in the range of an IBAT
                    register; otherwise 0.
     34 {2}         Set to 0.
     35 {3}         Set to 1 if the fetch access was to a direct-store
                    segment (T=1 in Segment Register or Segment Table
                    Entry); set to 0 otherwise.
     36 {4}         Set to 1 if a storage access is not permitted by the
                    page or IBAT protection mechanism described on page
                    436, otherwise 0.
     42 {10}        Set to 1 if the Segment Table search fails to find a
                    translation for the effective address, otherwise set
                    to 0.
     43:47 {11:15}  Set to 0.
     Others         Loaded from the MSR.

MSR  See Figure 80 on page 458.

Execution resumes at offset 0x00400 from the base real address
indicated by MSR_IP.

5.5.5 External Interrupt 

An External interrupt occurs when no higher priority exception exists, an
External interrupt exception is presented to the interrupt mechanism, and
MSR_EE=1. The occurrence of the interrupt does not cancel the request.

The following registers are set:

SRR0 Set to the effective address of the instruction that the processor
     would have attempted to execute next if no interrupt conditions
     were present.

SRR1
     33:36 {1:4}    Set to 0.
     42:47 {10:15}  Set to 0.
     Others         Loaded from the MSR.

MSR  See Figure 80 on page 458.

Execution resumes at offset 0x00500 from the base real address
indicated by MSR_IP.

5.5.6 Alignment Interrupt 

An Alignment interrupt occurs when no higher priority exception exists 
and the implementation cannot perform a storage access for one of the 
reasons listed below. 

■ The operand of a floating-point load or store is not word-aligned.

■ The operand of a fixed-point doubleword load or store is not
word-aligned.

■ The operand of lmw, stmw, lwarx, ldarx, stwcx., stdcx., eciwx, or
ecowx is not aligned.

■ The operand of a single-register load or store is not aligned and the
processor is in Little-Endian mode.

■ The instruction is lmw, stmw, lswi, lswx, stswi, or stswx and the
processor is in Little-Endian mode.

■ The operand of a floating-point load or store is in a direct-store
segment (T=1).

■ The operand of an elementary or string load or store crosses a
protection boundary.

■ The operand of lmw or stmw crosses a segment or BAT boundary.

■ The operand of dcbz is in storage that is Write Through Required or
Caching Inhibited, or dcbz is executed in an implementation that has
either no data cache or a Write Through data cache.

For lmw, stmw, lswi, lswx, stswi, and stswx in Little-Endian mode, an
Alignment interrupt always occurs. For lmw and stmw with an operand
that is not aligned in Big-Endian mode, and for lwarx, ldarx, stwcx.,
stdcx., eciwx, and ecowx with an operand that is not aligned in either
Endian mode, an implementation may yield boundedly undefined results
instead of causing an Alignment interrupt (for eciwx and ecowx when
EAR_E=0, a third alternative is to cause a Data Storage interrupt). For all
other cases listed above, an implementation may execute the instruction
correctly instead of causing an Alignment interrupt. (For dcbz, "correct"
execution means setting each byte of the block in main storage to 0x00.)

The following registers are set: 

SRR0 Set to the effective address of the instruction that caused the
     interrupt.

SRR1
     33:36 {1:4}    Set to 0.
     42:47 {10:15}  Set to 0.
     Others         Loaded from the MSR.

MSR  See Figure 80 on page 458.



DSISR
0:11 Set to 0.
12:13 Set to bits 30:31 of the instruction if DS-form.
Set to 0b00 if D- or X-form. (Set to 0b00 on 32-bit implementations.)
14 Set to 0.
15:16 Set to bits 29:30 of the instruction if X-form.
Set to 0b00 if D- or DS-form.
17 Set to bit 25 of the instruction if X-form.
Set to bit 5 of the instruction if D- or DS-form.
18:21 Set to bits 21:24 of the instruction if X-form.
Set to bits 1:4 of the instruction if D- or DS-form.
22:26 Set to bits 6:10 of the instruction (RT/RS/FRT/FRS), except
undefined for dcbz.
27:31 Set to bits 11:15 of the instruction (RA) for update form
instructions; set to either bits 11:15 of the instruction or to any
register number not in the range of registers to be loaded for
a valid form lmw, lswi, or lswx; otherwise undefined.

DAR Set to the effective address of the data access as computed by the
instruction causing the Alignment exception, except that if the
interrupt occurs in 32-bit mode on a 64-bit implementation, the
high-order 32 bits of the DAR are set to 0.

Programming Note

The architecture does not support the use of an unaligned effective
address by lwarx, ldarx, stwcx., stdcx., eciwx, and ecowx. If an
Alignment interrupt occurs because one of these instructions specifies
an unaligned effective address, the Alignment interrupt handler must
not attempt to simulate the instruction, but instead should treat the
instruction as a programming error.

For an X-form Load or Store, it is acceptable to set the DSISR to the
same value that would have resulted if the corresponding D- or DS-form
instruction had caused the interrupt. Similarly, for a D- or DS-form Load
or Store, it is acceptable to set the DSISR to the value that would have
resulted for the corresponding X-form instruction. For example, an
unaligned lwax (that crosses a protection boundary) would normally,
following the description above, cause the DSISR to be set to binary:

000000000000 00 0 01 0 0101 ttttt ?????

where "ttttt" denotes the RT field, and "?????" denotes undefined bits.
However, it is acceptable if it causes the DSISR to be set as for lwa, which
is

000000000000 10 0 00 0 1101 ttttt ?????

If there is no corresponding alternative form instruction (e.g., for lwaux),
the value described above must be set in the DSISR.
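
The DSISR field rules above can be checked with a short model. The following Python sketch is illustrative only and is not part of the architecture: the helper names bits and alignment_dsisr, the form argument, and the choice of registers in the two sample encodings are assumptions made for this example. The sketch always places the RA field in bits 27:31, although the architecture defines those bits only for the cases listed above.

```python
def bits(word, first, last):
    """Return bits first..last (inclusive) of a 32-bit word.

    Bit 0 is the most significant bit, as in the PowerPC numbering.
    """
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def alignment_dsisr(insn, form):
    """Model the DSISR value for an Alignment interrupt.

    insn is the 32-bit instruction image; form is "D", "DS", or "X"
    (a hypothetical parameter; real hardware derives it from the opcode).
    """
    if form == "X":
        f12_13 = 0
        f15_16 = bits(insn, 29, 30)
        f17 = bits(insn, 25, 25)
        f18_21 = bits(insn, 21, 24)
    else:
        f12_13 = bits(insn, 30, 31) if form == "DS" else 0
        f15_16 = 0
        f17 = bits(insn, 5, 5)
        f18_21 = bits(insn, 1, 4)
    f22_26 = bits(insn, 6, 10)    # RT/RS/FRT/FRS
    f27_31 = bits(insn, 11, 15)   # RA (assumption: always filled in here)
    return ((f12_13 << 18) | (f15_16 << 15) | (f17 << 14)
            | (f18_21 << 10) | (f22_26 << 5) | f27_31)


# Sample encodings with RT=5, RA=3 (and RB=4 for the X-form).
LWAX = (31 << 26) | (5 << 21) | (3 << 16) | (4 << 11) | (341 << 1)
LWA = (58 << 26) | (5 << 21) | (3 << 16) | 2

# Show bits 0:26 of the DSISR, leaving out the undefined low-order field.
print(format(alignment_dsisr(LWAX, "X") >> 5, "027b"))
print(format(alignment_dsisr(LWA, "DS") >> 5, "027b"))
```

With RT=5 the two printed patterns correspond to the lwax and lwa examples in the text, with "ttttt" replaced by 00101.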

The instruction pairs that may use the same DSISR value are: 

lbz/lbzx lbzu/lbzux lhz/lhzx lhzu/lhzux

lha/lhax lhau/lhaux lwz/lwzx lwzu/lwzux

lwa/lwax ld/ldx ldu/ldux

stb/stbx stbu/stbux sth/sthx sthu/sthux

stw/stwx stwu/stwux std/stdx stdu/stdux

lfs/lfsx lfsu/lfsux lfd/lfdx lfdu/lfdux

stfs/stfsx stfsu/stfsux stfd/stfdx stfdu/stfdux

Execution resumes at offset 0x00600 from the base real address
indicated by MSR_IP.

5.5.7 Program Interrupt

A Program interrupt occurs when no higher priority exception exists and 
one of the following exceptions arises during execution of an instruction: 

Floating-Point Enabled Exception 

A Floating-Point Enabled Exception type Program interrupt is 
generated when the expression 

(MSR_FE0 | MSR_FE1) & FPSCR_FEX

is 1. FPSCR_FEX is turned on by the execution of a floating-point
instruction that causes an enabled exception or by the execution 
of a Move To FPSCR instruction that results in both an exception 
bit and its corresponding enable bit being 1. 

Illegal Instruction 

An Illegal Instruction type Program interrupt is generated when 
execution is attempted of an illegal instruction, or of a reserved or 
optional instruction that is not provided by the implementation. 

An Illegal Instruction type Program interrupt may be generated 
when execution is attempted of any of the following kinds of 
instruction. If the interrupt is not generated, the alternative is 
shown in parentheses. 

■ an instruction that is in invalid form (boundedly undefined 
results) 

■ an lswx instruction for which RA or RB is in the range of
registers to be loaded (boundedly undefined results)

■ an mtspr or mfspr instruction with an SPR field that does not
contain one of the defined values:

— MSR_PR=1 and spr_0=1
(Privileged Instruction type Program interrupt)

— MSR_PR=0 or spr_0=0
(boundedly undefined results)

■ an unimplemented floating-point instruction that is not 
optional (Floating-Point Assist interrupt) 



Privileged Instruction 

The following applies when MSR_PR=1.

A Privileged Instruction type Program interrupt is generated when
execution is attempted of a privileged instruction, or of an mtspr
or mfspr instruction with an SPR field that contains one of the
defined values having spr_0=1. It may be generated when execution
is attempted of an mtspr or mfspr instruction with an SPR
field that does not contain one of the defined values but has
spr_0=1; in this case an Illegal Instruction type Program interrupt
may be generated instead.

Trap 

A Trap type Program interrupt is generated when any of the con- 
ditions specified in a Trap instruction is met. 

The following registers are set: 

SRR0 For all Program interrupts except a Floating-Point Enabled
Exception when in one of the Imprecise modes, set to the effective
address of the instruction that caused the Program interrupt.

For an Imprecise Mode Floating-Point Enabled Exception, set to
the effective address of the excepting instruction or to the effective
address of some subsequent instruction. If it points to a subsequent
instruction, that instruction has not been executed. If a subsequent
instruction is Synchronize (sync) or Instruction Synchronize (isync),
SRR0 will not point more than four bytes beyond the sync or isync
instruction.

If FPSCR_FEX=1 but Floating-Point Enabled Exception interrupts
are disabled by having both MSR_FE0 and MSR_FE1 = 0, a
Floating-Point Enabled Exception interrupt will occur prior to or at the
next synchronizing event if these MSR bits are altered by any
instruction that can set the MSR so that the expression

(MSR_FE0 | MSR_FE1) & FPSCR_FEX

is 1. When this occurs, SRR0 is loaded with the address of the
instruction that would have executed next, not with the address of
the instruction that modified the MSR causing the interrupt.

SRR1
33:36 {1:4} Set to 0.
42 {10} Set to 0.
43 {11} Set to 1 for a Floating-Point Enabled Exception type
Program interrupt, otherwise 0.
44 {12} Set to 1 for an Illegal Instruction type Program interrupt,
otherwise 0.
45 {13} Set to 1 for a Privileged Instruction type Program
interrupt, otherwise 0.
46 {14} Set to 1 for a Trap type Program interrupt, otherwise 0.
47 {15} Set to 0 if SRR0 contains the address of the instruction
causing the exception, and to 1 if SRR0 contains the
address of a subsequent instruction.
Others Loaded from the MSR.

Only one of bits 43:46 {11:14} can be set to 1.

MSR See Figure 80 on page 458. 

Execution resumes at offset 0x00700 from the base real address
indicated by MSR_IP.
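
As an illustration of how a handler might classify a Program interrupt from the SRR1 settings described above, the following Python sketch decodes bits {11:15} of a saved SRR1 image. It is a model for discussion, not architected behavior: the names srr1_mask, PROGRAM_REASONS, and decode_program_interrupt are hypothetical, and the sketch looks only at the low-order 32 bits, where 64-bit bits 43:47 and 32-bit bits {11:15} coincide.

```python
def srr1_mask(bit32):
    """Mask for a bit given in 32-bit numbering (bit 0 is most significant)."""
    return 1 << (31 - bit32)


# Reason bits {11:14}; names follow the interrupt types in this section.
PROGRAM_REASONS = {
    11: "Floating-Point Enabled Exception",
    12: "Illegal Instruction",
    13: "Privileged Instruction",
    14: "Trap",
}


def decode_program_interrupt(srr1):
    """Return (reason, srr0_points_past) for a saved SRR1 value.

    srr0_points_past is True when bit {15} says SRR0 holds the address
    of a subsequent instruction rather than the excepting one.
    """
    low = srr1 & 0xFFFF_FFFF
    reasons = [name for bit, name in PROGRAM_REASONS.items()
               if low & srr1_mask(bit)]
    if len(reasons) != 1:
        # Only one of bits {11:14} can be set to 1.
        raise ValueError("expected exactly one Program interrupt reason bit")
    return reasons[0], bool(low & srr1_mask(15))


print(decode_program_interrupt(srr1_mask(14)))
print(decode_program_interrupt(srr1_mask(11) | srr1_mask(15)))
```

The ValueError branch reflects the rule that only one of the four reason bits can be set; a real handler would choose its own response to an inconsistent image.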

5.5.8 Floating-Point Unavailable Interrupt 

A Floating-Point Unavailable interrupt occurs when no higher priority 
exception exists, an attempt is made to execute a floating-point
instruction (including floating-point loads, stores, and moves), and MSR_FP=0.
The following registers are set: 

SRR0 Set to the effective address of the instruction that caused the
interrupt.

SRR1
33:36 {1:4} Set to 0.
42:47 {10:15} Set to 0.
Others Loaded from the MSR.

MSR See Figure 80 on page 458. 

Execution resumes at offset 0x00800 from the base real address
indicated by MSR_IP.

5.5.9 Decrementer Interrupt

A Decrementer interrupt occurs when no higher priority exception exists, 
the Decrementer exception exists, and MSR_EE=1. The occurrence of the
interrupt cancels the request. 
The following registers are set: 

SRR0 Set to the effective address of the instruction that the processor
would have attempted to execute next if no interrupt conditions
were present.

SRR1
33:36 {1:4} Set to 0.
42:47 {10:15} Set to 0.
Others Loaded from the MSR.

MSR See Figure 80 on page 458. 

Execution resumes at offset 0x00900 from the base real address
indicated by MSR_IP.

5.5.10 System Call Interrupt 

A System Call interrupt occurs when a System Call instruction is exe- 
cuted. 

The following registers are set: 

SRR0 Set to the effective address of the instruction following the System
Call instruction.

SRR1
33:36 {1:4} Set to 0.
42:47 {10:15} Set to 0.
Others Loaded from the MSR.

MSR See Figure 80 on page 458. 

Execution resumes at offset 0x00C00 from the base real address
indicated by MSR_IP.

5.5.11 Trace Interrupt 

The Trace interrupt may optionally be implemented. 

If implemented, a Trace interrupt occurs when no higher priority
exception exists and either MSR_SE=1 and any instruction except rfi is
successfully completed, or MSR_BE=1 and a Branch instruction is
completed. Successful completion means that the instruction caused no other
interrupt. Thus a Trace interrupt never occurs for a System Call
instruction, nor for a Trap instruction that traps.
The following registers are set: 

SRR0 Set to the effective address of the instruction that the processor
would have attempted to execute next if no interrupt conditions
were present.

SRR1
33:36 and 42:47 {1:4 and 10:15} See the Book IV, PowerPC
Implementation Features document for the implementation.
Others Loaded from the MSR.

MSR See Figure 80 on page 458. 

For further details see the Book IV, PowerPC Implementation Fea- 
tures document for the implementation. 

Execution resumes at offset 0x00D00 from the base real address
indicated by MSR_IP.

5.5.12 Floating-Point Assist Interrupt

The Floating-Point Assist interrupt may optionally be implemented. Its 
purpose is to allow software assistance for the following cases. 

■ Implemented floating-point instructions that need software assistance
in order to complete certain operations, such as those involving
denormalized numbers.

■ Unimplemented floating-point instructions that are not optional. 

It is permissible for the processor to generate an Illegal Instruction 
type Program interrupt instead of a Floating-Point Assist interrupt in 
this case. 

The following registers are set: 

SRR0 Set to the effective address of the instruction that caused the
Floating-Point Assist interrupt.

SRR1
33:36 and 42:47 {1:4 and 10:15} See the Book IV, PowerPC
Implementation Features document for the implementation.
Others Loaded from the MSR.

MSR See Figure 80 on page 458.

For further details see the Book IV, PowerPC Implementation Fea- 
tures document for the implementation. 

Execution resumes at offset 0x00E00 from the base real address
indicated by MSR_IP.

5.6 Partially Executed Instructions 

The architecture permits certain instructions to be partially executed 
when an Alignment or Data Storage interrupt occurs, or when an impre- 
cise interrupt is forced by an instruction that causes an Alignment or 
Data Storage exception. These instructions are: 

1. Load Multiple or Load String that causes an Alignment or Data
Storage interrupt: some registers in the range of registers to be loaded may
have been loaded.

2. Store Multiple or Store String that causes an Alignment or Data Stor- 
age interrupt: some bytes of storage in the range addressed may have 
been updated. 

3. An elementary (non-multiple and non-string) store that causes an 
Alignment or Data Storage interrupt: some bytes just before the 
boundary may have been updated. If the instruction normally alters 
CR0 (stwcx., stdcx.), CR0 is set to an undefined value. For update
forms, the update register (RA) is not altered. 

4. A floating-point load that causes an Alignment or Data Storage inter- 
rupt: the target register (FRT) may be altered. For update forms, the 
update register (RA) is not altered. 

5. A load or store to a direct-store segment that causes a Data Storage 
interrupt due to a Direct-Store Error exception: some of the associated 
address/data transfers may not have been initiated. All initiated trans- 
fers are completed before the exception is reported, and the non- 
initiated transfers are aborted. Thus the instruction completes before 
the Data Storage interrupt occurs. 

In the cases above, the questions of how many registers and how much
storage is altered are implementation-, instruction-, and
boundary-dependent. However, storage protection is not violated. Furthermore, if some
of the data accessed is in direct-store (T=1) and the instruction is not
supported for direct-store, the locations in direct-store are not accessed.

In the following situation, partial execution is not allowed (this pre- 
serves restartability): 

An elementary (non-multiple and non-string) fixed-point load that 
causes an Alignment or Data Storage interrupt: the target register 
(RT) is not altered. For update forms, the update register (RA) is 
not altered. 

5.7 Exception Ordering 

Since multiple exceptions can exist at the same time and the architecture 
does not provide for reporting more than one interrupt at a time, the gen- 
eration of more than one interrupt is prohibited. Also, some exceptions 
would be lost if they were not recognized and handled when they 
occurred. For example, if an External interrupt was generated when a 
Data Storage exception existed, the Data Storage exception would be 
lost. If the Data Storage exception was caused by a Store Multiple 
instruction that spanned a page boundary and the exception was a result 
of attempting to access the second page, the store could have modified 
locations in the first page even though it appeared that the Store Multiple 
instruction was never executed. 

In addition, the architecture defines imprecise interrupts that must be 
recoverable, cannot be lost, and can occur at any time with respect to the 
executing instruction stream. Some of the maskable and nonmaskable 
exceptions are persistent and can be deferred. The following exceptions 
persist even though some other interrupt is generated: 

■ Floating-Point Enabled Exceptions 

■ External 

■ Decrementer 

For the above reasons, all exceptions are prioritized with respect to 
other exceptions that may exist at the same instant to prevent the loss of 
any exception that is not persistent. Some exceptions cannot exist at the 
same instant as some others.

5.7.1 Unordered Interrupt Conditions

The exceptions listed here are unordered, meaning that they may occur at 
any time regardless of the state of the interrupt mechanism. These excep- 
tions must be recognized and processed when presented. 

1. System Reset

2. Machine Check 

All other interrupts are ordered with respect to the interrupt mecha- 
nism resources. 

5.7.2 Ordered Exceptions 

The exceptions described here are ordered, meaning that only one can be 
reported. However, the single ordered exception that can be reported 
may exist in concert with unordered exceptions. Ordered exceptions may 
or may not be instruction-caused. The two lists identify the ordered 
interrupts by type. The order within the lists does not imply priority but 
only lists the possible exceptions that may be reported. 

System-caused or Imprecise

1. Program
   — Imprecise Mode Floating-Point Enabled Exception
2. External
3. Decrementer

Instruction-caused and Precise

1. Instruction Storage
2. Program
   — Illegal Instruction
   — Privileged Instruction
3. Function Dependent
   3.a Fixed-Point
       1a Program
          — Trap
       1b System Call
       1c.1 Alignment
       1c.2 Data Storage
       2 Trace (if implemented)
   3.b Floating-Point
       1 FP Unavailable
       2a Program
          — Precise Mode Floating-Point Enabled Exception
       2b Floating-Point Assist (if implemented)
       2c.1 Alignment
       2c.2 Data Storage
       3 Trace (if implemented)

For implementations that execute multiple instructions in parallel 
using pipeline or superscalar techniques, or combinations of these, it can 
be difficult to understand the ordering of exceptions. To understand this 
ordering it is useful to consider a model in which an instruction is 
fetched, decoded, and then executed. In this model, the exceptions a sin- 
gle instruction would generate are in the order shown in the list of 
instruction-caused exceptions. Exceptions with different numbers have 
different ordering. Exceptions with the same numbering but different 
lettering are mutually exclusive and cannot be caused by the same 
instruction. 

Even on processors that are capable of executing several instructions 
simultaneously, or out of order, instruction-caused interrupts (precise and 
imprecise) occur in program order. 

5.8 Interrupt Priorities

This section describes the relationship of nonmaskable, maskable, pre- 
cise, and imprecise interrupts. In the following descriptions, the interrupt 
mechanism waiting for all possible exceptions to be reported includes 
only exceptions caused by previously initiated instructions (e.g., it does 
not include waiting for the Decrementer to step through zero). The 
exceptions are listed in order of highest to lowest priority. 

1. System Reset

System Reset exception has the highest priority of all exceptions. If 
this exception exists, the interrupt mechanism ignores all other excep- 
tions and generates a System Reset interrupt. 

Once the System Reset interrupt is generated, no nonmaskable inter- 
rupts are generated due to exceptions caused by instructions issued 
prior to the generation of this interrupt.

2. Machine Check

Machine Check exception is the second highest priority exception. If 
this exception exists and a System Reset exception does not exist, the 
interrupt mechanism ignores all other exceptions and generates a 
Machine Check interrupt. 

Once the Machine Check interrupt is generated, no nonmaskable 
interrupts are generated due to exceptions caused by instructions 
issued prior to the generation of this interrupt. 

3. Instruction Dependent 

This exception is the third highest priority exception. When this 
exception is created, the interrupt mechanism waits for all possible 
Imprecise exceptions to be reported. It then generates the appropriate 
ordered interrupt if no higher priority exception exists when the inter- 
rupt is to be generated. Within this category a particular instruction 
may present more than a single exception. When this occurs, those 
exceptions are ordered in priority as indicated in the following lists. 

A. Fixed-Point Loads and Stores 

a. Alignment 

b. Data Storage 

c. Trace (if implemented) 

B. Floating-Point Loads and Stores 

a. Floating-Point Unavailable 

b. Alignment 

c. Data Storage 

d. Trace (if implemented) 

C. Other Floating-Point Instructions 

a. Floating-Point Unavailable 

b. Program — Precise Mode Floating-Point Enabled Exception 

c. Floating-Point Assist (if implemented) 

d. Trace (if implemented) 

D. rfi and mtmsr 

a. Program — Privileged Instruction 

b. Program — Precise Mode Floating-Point Enabled Exception 

c. Trace (if implemented), for mtmsr only 

If the MSR bits FE0 and FE1 are set such that Precise Mode
Floating-Point Enabled Exception interrupts are enabled and FPSCR bit
FEX is set, a Program interrupt will result prior to or at the next
synchronizing event.

E. Other exceptions

These exceptions are mutually exclusive and have the same priority: 

■ Program — Trap 

■ System Call 

■ Program — Privileged Instruction 

■ Program — Illegal Instruction 

F. Instruction Storage

This exception has the lowest priority in this category. It is recog- 
nized only when all instructions prior to the instruction causing this 
exception appear to have completed and that instruction is to be 
executed. 

The priority of this interrupt is specified for completeness and to 
ensure that it is not given more favorable treatment. It is accept- 
able for an implementation to treat this interrupt as though it had a 
lower priority. 

4. Program — Imprecise Mode Floating-Point Enabled Exception 
This exception is the fourth highest priority exception. When this 
exception is created, the interrupt mechanism waits for all other possi- 
ble exceptions to be reported. It then generates this interrupt if no 
higher priority exception exists when the interrupt is to be generated. 

5. External 

This exception is the fifth highest priority exception. When this excep- 
tion is created, the interrupt mechanism waits for all other possible 
exceptions to be reported. It then generates this interrupt if no higher 
priority exception exists when the interrupt is to be generated. 

6. Decrementer 

This exception is the lowest priority exception. When this exception is 
created, the interrupt mechanism waits for all other possible excep- 
tions to be reported. It then generates this interrupt if no higher prior- 
ity exception exists when the interrupt is to be generated. 
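
The six priority classes above amount to a fixed ordering, which the following Python sketch models. It is a simplification under stated assumptions: the names PRIORITY and select_interrupt are hypothetical, the pending exceptions are given as a plain set of strings, and the sketch ignores masking (for example MSR_EE) and the waiting for outstanding exceptions that the descriptions above require.

```python
# Priority classes from Section 5.8, highest priority first.
PRIORITY = [
    "System Reset",
    "Machine Check",
    "Instruction Dependent",
    "Imprecise Mode Floating-Point Enabled Exception",
    "External",
    "Decrementer",
]


def select_interrupt(pending):
    """Return the highest-priority exception class in pending, or None.

    pending is any collection of class names taken from PRIORITY.
    """
    for name in PRIORITY:
        if name in pending:
            return name
    return None


print(select_interrupt({"Decrementer", "External"}))
print(select_interrupt({"External", "Machine Check", "System Reset"}))
```

Because External and Decrementer exceptions persist, the lower-priority exception in the first call is not lost; in this model it would simply be selected on a later call once the External exception has been handled.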

6.1 Overview

The Time Base and the Decrementer provide timing functions for the sys- 
tem. A specific instruction is provided for reading the Time Base, while 
the Decrementer is manipulated as an SPR and the Time Base is written 
as an SPR. Both are volatile resources and must be initialized during star- 
tup. 

Time Base (TB) 

The Time Base provides a long-period counter driven by an imple- 
mentation-dependent frequency. 

Decrementer (DEC) 

The Decrementer, a counter that is updated at the same rate as the 
Time Base, provides a means of signaling an interrupt after a spec- 
ified amount of time has elapsed unless 

■ the Decrementer is altered by software in the interim, or 

■ the Time Base update frequency changes. 

6.2 Time Base 

The Time Base (TB) is a 64-bit register (see Figure 82, "Time Base," on 
page 480) containing a 64-bit unsigned integer that is incremented peri- 
odically. Each increment adds 1 to the low-order bit (bit 63). The fre- 
quency at which the integer is updated is implementation-dependent. 

There is no automatic initialization of the Time Base; system software 
must perform this initialization.

Programming Notes

Assuming that the operating system initializes the Time Base on power-on
to some reasonable value and that the update frequency of the Time Base
is constant, the Time Base can be used as a source of values that increase
at a constant rate, such as for time stamps in trace entries.

Even if the update frequency is not constant, values read from the Time
Base are monotonically increasing (except when the Time Base wraps from
2^64-1 to 0). If a trace entry is recorded each time the update frequency
changes, the sequence of Time Base values can be post-processed to
become actual time values.

On an implementation that performs speculative execution, the Time Base
may be read arbitrarily far "ahead" of the point at which it appears in the
instruction stream. If it is important that this not occur, a context
synchronizing operation such as the isync instruction should be placed
immediately before the instructions that read the Time Base.

See Book II, Section 4.4, "Computing Time of Day from the Time Base," on
page 354 for ways to compute time of day in POSIX format.



TBU (bits 0:31) | TBL (bits 32:63)

Field Description
TBU Upper 32 bits of Time Base
TBL Lower 32 bits of Time Base

Figure 82 Time Base

The Time Base increments until its value becomes
0xFFFF_FFFF_FFFF_FFFF (2^64 - 1). At the next increment, its value
becomes 0x0000_0000_0000_0000. There is no interrupt or other
indication when this occurs.

The period of the Time Base depends on the driving frequency. As an 
order of magnitude example, suppose that the CPU clock is 100 MHz 
and that the Time Base is driven by this frequency divided by 32. Then 
the period of the Time Base would be 



    T_TB = (2^64 × 32) / 100 MHz = 5.90 × 10^12 seconds



which is approximately 187,000 years. 
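
The arithmetic behind this order-of-magnitude figure can be reproduced with a few lines of Python. This is only a worked example: the function name counter_period_seconds and the use of a 365.25-day year are assumptions, and the 100 MHz clock and divide-by-32 ratio are the illustrative values from the text, not architected ones.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def counter_period_seconds(width_bits, cpu_hz, divider):
    """Time for a width_bits counter to wrap when updated at cpu_hz / divider."""
    update_hz = cpu_hz / divider
    return (2 ** width_bits) / update_hz


tb_period = counter_period_seconds(64, 100e6, 32)
tb_years = tb_period / SECONDS_PER_YEAR
print(f"{tb_period:.3g} seconds, about {tb_years:,.0f} years")
```

The same function applies to any counter driven at the Time Base update frequency; only the width changes.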

The Time Base must be implemented such that the following require- 
ments are satisfied. 

1. Loading a GPR from the Time Base shall have no effect on the
accuracy of the Time Base.

2. Storing a GPR to the Time Base shall replace the value in the Time 
Base with the value in the GPR. 

The PowerPC Architecture does not specify a relationship between the 
frequency at which the Time Base is updated and other frequencies, such 
as the CPU clock or bus clock in a PowerPC system. The Time Base 
update frequency is not required to be constant. What is required, so 
that system software can keep time of day and operate interval timers, is 
one of the following. 

■ The system provides an (implementation-dependent) interrupt to soft- 
ware whenever the update frequency of the Time Base changes, and a 
means to determine what the current update frequency is. 

■ The update frequency of the Time Base is under the control of the
system software.

6.2.1 Writing the Time Base

Writing the Time Base is privileged. Reading the Time Base is not privi- 
leged; it is discussed in Book II, Chapter 4, "Time Base," on page 351. 

It is not possible to write the entire 64-bit Time Base using a single 
instruction. The mttbl and mttbu extended mnemonics write the lower 
and upper halves of the Time Base (TBL and TBU), respectively, preserv- 
ing the other half. These are extended mnemonics for the mtspr instruc- 
tion; see page 495. 

The Time Base can be written by a sequence such as: 



    lwz    Rx,upper    # load 64-bit value for
    lwz    Ry,lower    #  TB into Rx and Ry
    li     Rz,0
    mttbl  Rz          # force TBL to 0
    mttbu  Rx          # set TBU
    mttbl  Ry          # set TBL

Loading 0 into TBL prevents the possibility of a carry from TBL to 
TBU while the Time Base is being initialized. 
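
The reason for zeroing TBL first can be seen in a small simulation. The Python sketch below is a toy model, not a description of any implementation: the TimeBase class, its method names (borrowed from the mnemonics), the write_tb helper, and the assumption that the counter advances by a fixed number of ticks between instructions are all invented for illustration.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


class TimeBase:
    """Toy 64-bit Time Base whose halves can be written separately."""

    def __init__(self, value=0):
        self.value = value & MASK64

    def tick(self, n=1):
        self.value = (self.value + n) & MASK64

    def mttbl(self, x):
        # Replace the lower half, preserving TBU.
        self.value = (self.value & ~MASK32 & MASK64) | (x & MASK32)

    def mttbu(self, x):
        # Replace the upper half, preserving TBL.
        self.value = ((x & MASK32) << 32) | (self.value & MASK32)


def write_tb(tb, upper, lower, zero_tbl_first, ticks_between=1):
    """Write upper:lower into tb while the counter keeps running."""
    if zero_tbl_first:
        tb.mttbl(0)
        tb.tick(ticks_between)
    tb.mttbu(upper)
    tb.tick(ticks_between)   # a carry out of TBL here would corrupt TBU
    tb.mttbl(lower)
    return tb.value


# Start with TBL one tick away from carrying into TBU.
start = 0x0000_0001_FFFF_FFFF
print(hex(write_tb(TimeBase(start), 0x12, 0x34, zero_tbl_first=False)))
print(hex(write_tb(TimeBase(start), 0x12, 0x34, zero_tbl_first=True)))
```

In the first call the carry from TBL lands in the freshly written TBU, so the upper half ends up one too large; in the second call TBL has just been cleared, so no carry can occur during the short sequence.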



compute time of day in 
POSIX format. 



Programming Note

The instructions for writing the Time Base are implementation- and
mode-independent. Thus code written to set the Time Base on a 32-bit
implementation will work correctly on a 64-bit implementation running
in either 64- or 32-bit mode.



6.3 Decrementer 

The Decrementer (DEC) is a 32-bit decrementing counter that provides a 
mechanism for causing a Decrementer interrupt after a programmable 
delay. 



DEC (bits 0:31)

Figure 83 Decrementer

The Decrementer is driven by the same frequency as the Time Base. 
The period of the Decrementer will depend on the driving frequency, but 
if the same values are used as given above for the Time Base (see Section 
6.2), and if the Time Base update frequency is constant, the period would 
be 

    T_DEC = (2^32 × 32) / 100 MHz = 1.37 × 10^3 seconds

which is approximately 23 minutes.

Programming Note

In systems that change the Time Base update frequency for purposes such
as power management, the Decrementer input frequency will also change.
Software must be aware of this in order to set interval timers.

On an implementation that performs speculative execution, the
Decrementer may be read arbitrarily far "ahead" of the point at which it
appears in the instruction stream. If it is important that this not occur, a
context synchronizing operation such as the isync instruction should be
placed immediately before the instruction that reads the Decrementer.

Programming Note

If the execution of the mtdec instruction causes bit 0 of the Decrementer
to change from 0 to 1, an interrupt request is signaled.



The Decrementer counts down, causing an interrupt (unless masked) 
when passing through zero. The Decrementer must be implemented such 
that the following requirements are satisfied. 

1. The operation of the Time Base and the Decrementer is coherent, i.e., 
the counters are driven by the same fundamental time base. 

2. Loading a GPR from the Decrementer shall have no effect on the accu- 
racy of the Decrementer. 

3. Storing a GPR to the Decrementer shall replace the value in the Decre- 
menter with the value in the GPR. 

4. Whenever bit 0 of the Decrementer changes from 0 to 1, an interrupt 
request is signaled. If multiple Decrementer Interrupt requests are 
received before the first can be reported, only one interrupt is reported. 
The occurrence of a Decrementer interrupt cancels the request. 

5. If the Decrementer is altered by software and the content of bit 0 is 
changed from 0 to 1, an interrupt request is signaled. 
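
Requirements 3 through 5 can be summarized in a small behavioral model. The Python sketch below is illustrative only: the Decrementer class, its request flag, and the take_interrupt method are hypothetical names, and the model leaves out MSR_EE masking and interrupt priorities.

```python
MASK32 = 0xFFFF_FFFF
BIT0 = 0x8000_0000  # bit 0 is the most significant bit


class Decrementer:
    """Toy 32-bit Decrementer with a single pending-request flag."""

    def __init__(self, value=0):
        self.value = value & MASK32
        self.request = False

    def _update(self, new):
        new &= MASK32
        # A request is signaled whenever bit 0 changes from 0 to 1,
        # whether by counting down or by a software write.
        if not (self.value & BIT0) and (new & BIT0):
            self.request = True
        self.value = new

    def tick(self):
        self._update(self.value - 1)

    def mtdec(self, x):
        self._update(x)

    def mfdec(self):
        # Reading has no effect on the content or the request.
        return self.value

    def take_interrupt(self):
        # The occurrence of the interrupt cancels the request.
        self.request = False


dec = Decrementer(2)
dec.tick()
dec.tick()
print(hex(dec.mfdec()), dec.request)   # reached zero, nothing requested yet
dec.tick()
print(hex(dec.mfdec()), dec.request)   # passed through zero
```

Because the request is a single flag, several 0-to-1 transitions before the interrupt is taken still produce only one interrupt, matching requirement 4.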



6.3.1 Writing and Reading the Decrementer 

The content of the Decrementer can be read or written using the mfspr 
and mtspr instructions, both of which are privileged when they refer to 
the Decrementer. Using an extended mnemonic (see page 495), the Dec- 
rementer may be written from register GPR Rx using: 

mtdec Rx 

The Decrementer may be read into GPR Rx using:

mfdec Rx 

Copying the Decrementer to a GPR has no effect on the Decrementer 
content or interrupt mechanism. 



III PowerPC Operating Environment Architecture



Synchronization Requirements for Special Registers and for Lookaside Buffers




Changing the value in certain system registers, and invalidating SLB and
TLB entries, can have the side effect of altering the context in which data
addresses and instruction addresses are interpreted, and in which
instructions are executed. For example, changing MSR_IR from 0 to 1 has the
side effect of enabling translation of instruction addresses. These side 
effects need not occur in program order, and therefore may require 
explicit synchronization by software. (Program order is defined in Book 
II, Section 1.1, "Definitions and Notation," on page 319.) 

An instruction that alters the context in which data addresses or 
instruction addresses are interpreted, or in which instructions are exe- 
cuted, is called a "context-altering instruction." This chapter covers all 
the context-altering instructions. The software synchronization required 
for them is shown in Table 12 on page 485 (for data access) and Table 13 
on page 486 (for instruction fetch and execution). 

The notation "CSI" in the tables means any context synchronizing 
instruction (i.e., sc, isync, or rfi). A context synchronizing interrupt (i.e.,
any interrupt except non-recoverable System Reset or non-recoverable
Machine Check) can be used instead of a context synchronizing
instruction. If it is, phrases like "the synchronizing instruction," below, should
be interpreted as meaning the instruction at which the interrupt occurs.
If no software synchronization is required before (after) a
context-altering instruction, "the synchronizing instruction before (after) the
context-altering instruction" should be interpreted as meaning the
context-altering instruction itself.

Programming Note

Sometimes advantage can be taken of the fact that certain instructions
that occur naturally in the program, such as the rfi at the end of an
interrupt handler, provide the required synchronization.

The synchronizing instruction before the context-altering instruction 
ensures that all instructions up to and including that synchronizing 
instruction are fetched and executed in the context that existed before the 
alteration. The synchronizing instruction after the context-altering 
instruction ensures that all instructions after that synchronizing instruc- 
tion are fetched and executed in the context established by the alteration. 
Instructions after the first synchronizing instruction, up to and including 
the second synchronizing instruction, may be fetched or executed in 
either context. 
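
The three regions described above can be modelled in a few lines. The following is only an illustrative sketch: the function name and the use of program-order indices are assumptions made for this example, and it presumes the context-altering instruction lies between the two synchronizing instructions.

```python
def fetch_context(index, first_sync, second_sync):
    """Return the context in which instruction `index` may be fetched/executed.

    All arguments are positions in program order (hypothetical model):
    first_sync  -- the synchronizing instruction before the alteration
    second_sync -- the synchronizing instruction after the alteration
    """
    if index <= first_sync:
        return "old"      # up to and including the first synchronizing instruction
    if index <= second_sync:
        return "either"   # after the first, up to and including the second
    return "new"          # everything after the second synchronizing instruction
```

For example, with synchronizing instructions at positions 2 and 5, positions 3 through 5 fall in the window where software must not depend on either context.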

If a sequence of instructions contains context-altering instructions and 
contains no instructions that are affected by any of the context alter- 
ations, no software synchronization is required within the sequence. 

No software synchronization is required before altering the MSR 
(except perhaps when altering the POW or LE bits: see the tables), 
because mtmsr is execution synchronizing. No software synchronization 
is required before most of the other alterations shown in Table 13 on 
page 486, because all instructions before the context-altering instruction 
are fetched and decoded before the context-altering instruction is exe- 
cuted (the processor must determine whether any of the preceding 
instructions are context synchronizing). 
Instruction or Event    Required Before    Required After    Notes

interrupt               none               none
rfi                     none               none
sc                      none               none
mtmsr (SF)              none
mtmsr (ILE)             none               none
mtmsr (PR)              none
mtmsr (ME)              none               CSI               1
mtmsr (DR)              none               CSI
mtmsr (LE)                                                   3
mtsr[in]
mtspr (ASR)
mtspr (SDR1)            sync               CSI               6, 7
mtspr (DBAT)
mtspr (DABR)                                                 5
mtspr (EAR)             CSI                CSI
slbie                   CSI                CSI or sync       8
slbia                   CSI                CSI or sync       8
tlbie                   CSI                CSI or sync       8, 9
tlbia                   CSI                CSI or sync       8, 9

Table 12. Synchronization requirements for data access
Instruction or Event    Required Before    Required After    Notes

interrupt
rfi
mtmsr (ME)
                                                             11
mtspr (ASR)             none               CSI               11
mtspr (SDR1)            sync                                 6, 7
                        none                                 11
mtspr (DEC)             none               none              12
slbie                   none               CSI or sync       8
slbia                   none               CSI or sync       8
tlbie                   none               CSI or sync       8, 9
tlbia                   none               CSI or sync       8, 9

Table 13. Synchronization requirements for instruction fetch and/or execution
Notes:

1. A context synchronizing instruction is required after altering the ME
bit to ensure that the alteration takes effect for subsequent Machine
Check interrupts, which may not be recoverable and therefore may
not be context synchronizing.

2. Synchronization requirements for changing the power conserving 
mode are implementation-dependent, and are specified in the Book 
IV, PowerPC Implementation Features document for the implemen- 
tation. 

3. Synchronization requirements for changing from one Endian mode 
to the other are implementation-dependent, and are specified in the 
Book IV, PowerPC Implementation Features document for the 
implementation. 

4. The effect of changing the EE bit is immediate. 

■ If an mtmsr instruction sets the EE bit to 0, neither an External 
interrupt nor a Decrementer interrupt occurs after the mtmsr is 
executed. 

■ If an mtmsr instruction changes the EE bit from 0 to 1 when an 
External, Decrementer, or higher priority exception exists, the cor- 
responding interrupt occurs immediately after the mtmsr is exe- 
cuted, and before the next instruction is executed in the program 
that set EE to 1. 

5. Synchronization requirements for changing the Data Address Break- 
point Register are implementation-dependent, and are specified in 
the Book IV, PowerPC Implementation Features document for the 
implementation. 

6. SDR1 must not be altered when MSR[DR]=1 or MSR[IR]=1; if it is, the
results are undefined.

7. A sync instruction is required before the mtspr instruction because
SDR1 identifies the Page Table and thereby the location of Reference
and Change bits. To ensure that Reference and Change bits are
updated in the correct Page Table, SDR1 must not be altered until all
Reference and Change bit updates due to instructions before the
mtspr have completed. A sync instruction guarantees this
synchronization of Reference and Change bit updates, while neither a
context synchronizing operation nor the instruction fetching mechanism
does so.
Programming Note

Regarding Note 8, the following sequence illustrates why it is
necessary, for data accesses, to ensure that all storage accesses due
to instructions before the slbie, slbia, tlbie, or tlbia have completed
to a point at which they have reported all exceptions they will cause.
Assume that valid Segment Table and Page Table entries exist for the
target storage location when the sequence starts.

1. A program issues a load or store to a segment (page).

2. The same program marks the entry for the target segment (page)
invalid in the Segment Table (Page Table).

3. The same program executes an slbie or slbia (tlbie or tlbia) that
invalidates the corresponding SLB entry (TLB entry).

4. The load or store instruction finally executes, and gets a segment
fault (page fault).

The segment fault or page fault is semantically incorrect. In order to
prevent it, a context synchronizing instruction must be executed
between steps 1 and 2.

8. For data accesses, the context synchronizing instruction before the
slbie, slbia, tlbie, or tlbia instruction ensures that all storage accesses
due to preceding instructions have completed to a point at which
they have reported all exceptions they will cause.

The context synchronizing instruction after the slbie, slbia, tlbie, or
tlbia ensures that subsequent storage accesses (data and instruction)
will not use the SLB or TLB entry(s) being invalidated. It does not
ensure that all storage accesses previously translated by the SLB or
TLB entry(s) being invalidated have completed with respect to
storage or, for tlbie or tlbia, that Reference and Change bit updates
associated with those storage accesses have completed; if these
completions must be ensured, the slbie, slbia, tlbie, or tlbia must be
followed by a sync instruction rather than by a context
synchronizing instruction.

Section 4.12, "Table Update Synchronization Requirements," on 
page 446 gives examples of the synchronization required when using 
slbie or tlbie in a sequence that alters a Segment Table Entry or a 
Page Table Entry. 

9. Multiprocessor systems have other requirements to synchronize 
"TLB shoot down" (i.e., to invalidate one or more TLB entries on all 
processors in the multiprocessor system and be able to determine 
that the invalidations have completed and that all side effects of the 
invalidations have taken effect). 

10. The alteration must not cause an implicit branch in effective address 
space. Thus the mtmsr instruction and all subsequent instructions, 
up to and including the next context synchronizing instruction, must 
have effective addresses that are less than 2^32.

11. The alteration must not cause an implicit branch in real address 
space. Thus the real address of the context-altering instruction and of 
each subsequent instruction, up to and including the next context 
synchronizing instruction, must be independent of whether the alter- 
ation has taken effect. 

12. The elapsed time between the content of the Decrementer becoming 
negative and the signaling of the Decrementer exception is not 
defined. 
Optional Facilities
and Instructions 




The facilities (Special Purpose Registers and instructions) described in 
this appendix are optional. An implementation may choose to provide 
all, some, or none of them. If a facility is implemented that matches the 
semantics of a facility described here, the implementation should be as 
specified here. 

A.1 External Control 

The External Control facility provides a means for a problem state
program to communicate with a special-purpose device. Two instructions
are provided:

■ External Control In Word Indexed (eciwx), which does the following: 

— Computes an effective address (EA) as for any X-form instruction 

— Validates the EA as would be done for a load from that address 

— Translates the EA to a real address 

— Transmits the real address to the device 

— Accepts a word of data from the device and places it into a general 
purpose register 
■ External Control Out Word Indexed (ecowx), which does the following:

— Computes an effective address (EA) as for any X-form instruction 

— Validates the EA as would be done for a store to that address 

— Translates the EA to a real address 

— Transmits the real address and a word of data from a general 
purpose register to the device 

Depending on the setting of a control bit in a Special Purpose Register, 
the External Access Register (EAR), the processor either performs the 
external control operation or generates a Data Storage interrupt. The 
EAR controls access to the External Control facility. Access to the EAR 
itself is privileged; the operating system can determine which tasks are 
allowed to issue External Access instructions and when they are allowed 
to do so. 

The data access of eciwx and ecowx is performed as though the stor- 
age access mode bits "WIMG" were 0101 (see Section 4.8). 

Interpretation of the real address transmitted by eciwx and ecowx and 
the 32-bit value transmitted by ecowx is up to the target device. Such 
interpretation is not specified by the PowerPC Architecture. See the Sys- 
tem Architecture documentation for a given PowerPC system for details 
on how the External Control facility can be used with devices on that sys- 
tem. 

Example: An example of a device designed to be used with the External 
Control facility might be a graphics adapter. The ecowx instruction 
might be used to send the device the translated real address of a buffer 
containing graphics data, and the word transmitted from the general 
purpose register might be control information that tells the adapter what 
operation to perform on the data in the buffer. The eciwx instruction 
might be used to load status information from the adapter. 

A.1.1 External Access Register 

This 32-bit Special Purpose Register controls access to the External Con- 
trol facility and, for external control operations that are permitted, deter- 
mines which device is the target. 
 E   ///                        RID
 0   1                          26    31

Bit     Name   Description

0       E      Enable bit
26:31   RID    Resource ID

All other fields are reserved.

Figure 84. External Access Register



The high-order bits of the RID field that correspond to bits of the 
Resource ID beyond the width of the Resource ID supported by a partic- 
ular implementation are treated as reserved bits. 
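
As a small illustration of the layout in Figure 84, the two defined fields can be extracted from a 32-bit EAR value as sketched below. The function name is hypothetical; the only facts assumed are the ones stated above (bit 0, the most significant bit, is E, and bits 26:31 are RID).

```python
def decode_ear(ear):
    """Split a 32-bit External Access Register value into its defined fields.

    Bit numbering is big-endian: bit 0 is the most significant bit.
    """
    ear &= 0xFFFFFFFF
    return {
        "E": (ear >> 31) & 1,   # bit 0: Enable bit
        "RID": ear & 0x3F,      # bits 26:31: Resource ID
    }
```

An implementation that supports a narrower Resource ID would treat the high-order RID bits as reserved, which this sketch does not model.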

A.1.2 External Access Instructions 

External Control In Word Indexed X-form

eciwx   RT,RA,RB

  31     RT     RA     RB     310    /
  0      6      11     16     21     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
if EAR[E] = 1 then
    raddr <- address translation of EA
    send load word request for raddr to
        device identified by EAR[RID]
    RT <- ³²0 || word from device
else
    DSISR[11] <- 1
    generate Data Storage interrupt

Let the effective address (EA) be the sum (RA|0)+(RB).

If EAR[E]=1, a load word request for the real address corresponding to
EA is sent to the device identified by EAR[RID], bypassing the cache.
RT0:31{} is set to 0. The word returned by the device is placed in
RT32:63{0:31}.

If EAR[E]=0, a Data Storage interrupt is taken, with bit 11 of DSISR set
to 1.

EA must be a multiple of 4. If it is not, one of the following occurs:

■ an Alignment interrupt is generated

■ a Data Storage interrupt is generated (possible only if EAR[E]=0)

■ the results are boundedly undefined

The eciwx instruction is supported for effective addresses that
reference ordinary segments (T=0) and for EAs mapped by DBAT registers. If
the EA refers to a direct-store segment (T=1), either a Data Storage
interrupt occurs or the results are boundedly undefined. If this
instruction is executed when MSR[DR]=0 (real addressing mode), the
results are boundedly undefined.

This instruction is treated as a load from the addressed byte with
respect to address translation, storage protection, reference and change
recording, and the ordering done by eieio.

Special Registers Altered 

None 



External Control Out Word Indexed X-form

ecowx   RS,RA,RB

  31     RS     RA     RB     438    /
  0      6      11     16     21     31

if RA = 0 then b <- 0
else           b <- (RA)
EA <- b + (RB)
if EAR[E] = 1 then
    raddr <- address translation of EA
    send store word request for raddr to
        device identified by EAR[RID]
    send (RS32:63{0:31}) to device
else
    DSISR[11] <- 1
    generate Data Storage interrupt

Let the effective address (EA) be the sum (RA|0)+(RB).

If EAR[E]=1, a store word request for the real address corresponding to
EA and the contents of RS32:63{0:31} are sent to the device identified by
EAR[RID], bypassing the cache.

If EAR[E]=0, a Data Storage interrupt is taken, with bit 11 of DSISR set
to 1.

EA must be a multiple of 4. If it is not, one of the following occurs:

■ an Alignment interrupt is generated

■ a Data Storage interrupt is generated (possible only if EAR[E]=0)

■ the results are boundedly undefined

The ecowx instruction is supported for effective addresses that
reference ordinary segments (T=0) and for EAs mapped by DBAT registers. If
the EA refers to a direct-store segment (T=1), either a Data Storage
interrupt occurs or the results are boundedly undefined. If this
instruction is executed when MSR[DR]=0 (real addressing mode), the
results are boundedly undefined.

This instruction is treated as a store to the addressed byte with respect
to address translation, storage protection, reference and change
recording, and the ordering done by eieio.

Special Registers Altered 

None 
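
The behavior of the two External Access instructions described above can be approximated by a small software model. This is a sketch only, not an emulator: the exception classes, the `translate` callback, and the `device_load`/`device_store` callbacks are all hypothetical stand-ins, registers are modelled as 32-bit values, and for a misaligned EA the model arbitrarily picks the Alignment interrupt from the permitted outcomes.

```python
class DataStorageInterrupt(Exception):
    """Raised when EAR[E] = 0 (the architecture also sets DSISR bit 11)."""


class AlignmentInterrupt(Exception):
    """One of the permitted outcomes when EA is not a multiple of 4."""


def effective_address(gpr, ra, rb):
    """EA = (RA|0) + (RB), truncated to 32 bits in this sketch."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFF


def _check_access(ear, ea):
    """Apply the EAR[E] and alignment checks; return EAR[RID]."""
    if (ear >> 31) & 1 == 0:           # EAR[E] is bit 0 (most significant)
        raise DataStorageInterrupt("EAR[E] = 0")
    if ea % 4 != 0:                    # model choice: Alignment interrupt
        raise AlignmentInterrupt(hex(ea))
    return ear & 0x3F                  # EAR[RID] is bits 26:31


def eciwx(gpr, rt, ra, rb, ear, translate, device_load):
    """Load a word from the device selected by EAR[RID] into GPR rt."""
    ea = effective_address(gpr, ra, rb)
    rid = _check_access(ear, ea)
    gpr[rt] = device_load(rid, translate(ea)) & 0xFFFFFFFF


def ecowx(gpr, rs, ra, rb, ear, translate, device_store):
    """Send the real address and the word in GPR rs to the device."""
    ea = effective_address(gpr, ra, rb)
    rid = _check_access(ear, ea)
    device_store(rid, translate(ea), gpr[rs] & 0xFFFFFFFF)
```

The model deliberately leaves out direct-store segments and real addressing mode, for which the text above says the results are boundedly undefined.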
Mnemonics




In order to make assembler language programs simpler to write and eas- 
ier to understand, a set of extended mnemonics and symbols is provided 
that defines simple shorthand for the most frequently used forms of 
Branch Conditional, Compare, Trap, Rotate and Shift, and certain other 
instructions. 

This appendix defines extended mnemonics related to mtspr and
mfspr, including privileged SPRs.

Assemblers should provide the mnemonics and symbols listed here,
and may provide others.



B.1 Move To/From Special Purpose
Register Mnemonics

The mtspr and mfspr instructions specify a Special Purpose Register 
(SPR) as a numeric operand. Extended mnemonics are provided that rep- 
resent the SPR in the mnemonic rather than requiring it to be coded as an 
operand. Also shown here are extended mnemonics for Move From Time 
Base and Move From Time Base Upper, which are variants of the mftb 
instruction rather than of mfspr. 

Note: mftb serves as both a basic and an extended mnemonic. The 
assembler will recognize an mftb mnemonic with two operands as the basic 
form, and an mftb mnemonic with one operand as the extended form. 
                                          Move To SPR                          Move From SPR¹
Special Purpose Register                  Extended        Equivalent to        Extended        Equivalent to

Fixed Point Exception Register            mtxer Rx        mtspr 1,Rx           mfxer Rx        mfspr Rx,1
Link Register                             mtlr Rx         mtspr 8,Rx           mflr Rx         mfspr Rx,8
Count Register                            mtctr Rx        mtspr 9,Rx           mfctr Rx        mfspr Rx,9
Data Storage Interrupt Status Register    mtdsisr Rx      mtspr 18,Rx          mfdsisr Rx      mfspr Rx,18
Data Address Register                     mtdar Rx        mtspr 19,Rx          mfdar Rx        mfspr Rx,19
Decrementer                               mtdec Rx        mtspr 22,Rx          mfdec Rx        mfspr Rx,22
Storage Description Register 1            mtsdr1 Rx       mtspr 25,Rx          mfsdr1 Rx       mfspr Rx,25
Save/Restore Register 0                   mtsrr0 Rx       mtspr 26,Rx          mfsrr0 Rx       mfspr Rx,26
Save/Restore Register 1                   mtsrr1 Rx       mtspr 27,Rx          mfsrr1 Rx       mfspr Rx,27
Special Purpose Registers G0 through G3   mtsprg n,Rx     mtspr 272+n,Rx       mfsprg Rx,n     mfspr Rx,272+n
Address Space Register                    mtasr Rx        mtspr 280,Rx         mfasr Rx        mfspr Rx,280
External Access Register                  mtear Rx        mtspr 282,Rx         mfear Rx        mfspr Rx,282
Time Base [Lower]                         mttbl Rx        mtspr 284,Rx         mftb Rx         mftb Rx,268
Time Base Upper                           mttbu Rx        mtspr 285,Rx         mftbu Rx        mftb Rx,269
Processor Version Register                -               -                    mfpvr Rx        mfspr Rx,287
IBAT Registers, Upper                     mtibatu n,Rx    mtspr 528+2×n,Rx     mfibatu Rx,n    mfspr Rx,528+2×n
IBAT Registers, Lower                     mtibatl n,Rx    mtspr 529+2×n,Rx     mfibatl Rx,n    mfspr Rx,529+2×n
DBAT Registers, Upper                     mtdbatu n,Rx    mtspr 536+2×n,Rx     mfdbatu Rx,n    mfspr Rx,536+2×n
DBAT Registers, Lower                     mtdbatl n,Rx    mtspr 537+2×n,Rx     mfdbatl Rx,n    mfspr Rx,537+2×n

¹Except for mftb and mftbu.

Table 14. Extended mnemonics for moving to/from an SPR
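
The "Move To SPR" half of Table 14 amounts to a lookup plus, for the indexed registers, a little arithmetic. The sketch below shows how an assembler might perform that expansion; the function and dictionary names are hypothetical and the SPR numbers are taken from the table. The "Move From" direction and the mftb special cases are left out.

```python
# SPR numbers for extended mnemonics that take only a source register.
FIXED_SPRS = {
    "xer": 1, "lr": 8, "ctr": 9, "dsisr": 18, "dar": 19, "dec": 22,
    "sdr1": 25, "srr0": 26, "srr1": 27, "asr": 280, "ear": 282,
    "tbl": 284, "tbu": 285,
}

# (base SPR number, stride) for mnemonics that also take an index n.
INDEXED_SPRS = {
    "sprg": (272, 1),
    "ibatu": (528, 2), "ibatl": (529, 2),
    "dbatu": (536, 2), "dbatl": (537, 2),
}


def expand_move_to(mnemonic, *operands):
    """Expand an extended 'mt...' mnemonic into its basic mtspr form."""
    if not mnemonic.startswith("mt"):
        raise ValueError("not a move-to mnemonic: " + mnemonic)
    name = mnemonic[2:]
    if name in FIXED_SPRS:
        (rx,) = operands
        return "mtspr {},{}".format(FIXED_SPRS[name], rx)
    base, stride = INDEXED_SPRS[name]
    n, rx = operands
    return "mtspr {},{}".format(base + stride * int(n), rx)
```

For instance, `expand_move_to("mtsprg", 2, "r5")` yields `mtspr 274,r5`, following the 272+n rule in the table.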






Cross-Reference for Changed
POWER Mnemonics



The following table lists the POWER instruction mnemonics that have 
been changed in the PowerPC Operating Environment Architecture, 
sorted by POWER mnemonic. 

To determine the PowerPC mnemonic for one of these POWER mne- 
monics, find the POWER mnemonic in the second column of the table: 
the remainder of the line gives the PowerPC mnemonic and the page on 
which the instruction is described, as well as the instruction names. 

POWER mnemonics that have not changed are not listed. POWER
instruction names that are the same in PowerPC are not repeated: i.e., for 
these, the last column of the table is blank. 



        POWER                                             PowerPC
Page    Mnemonic   Instruction                            Mnemonic   Instruction

441     mtsri      Move To Segment Register Indirect      mtsrin
378     svca       Supervisor Call                        sc         System Call
444     tlbi       TLB Invalidate Entry                   tlbie



The following instructions in the PowerPC Operating Environment
Architecture are new: they are not in the POWER Architecture. dcbi
exists in all PowerPC implementations, mfsrin exists only in 32-bit
implementations, and the SLB instructions exist only in 64-bit
implementations. The SLB and TLB instructions are optional.

dcbi Data Cache Block Invalidate 

eciwx External Control In Word Indexed 

ecowx External Control Out Word Indexed 

mfsrin Move From Segment Register Indirect 

slbia SLB Invalidate All 

slbie SLB Invalidate Entry 

tlbia TLB Invalidate All

tlbsync TLB Synchronize 



Implementation-Specific
SPRs




This appendix lists Special Purpose Register (SPR) numbers assigned by 
the PowerPC Architecture Review Process for implementation-specific 
uses. If a register shown here is present in a particular implementation, a 
detailed description will be found in Book IV, PowerPC Implementation 
Features. 

The intent of this list is to ensure that if an SPR is needed for a partic- 
ular function on more than one implementation, the same SPR number 
will be used. 

Note that ordering of the bits shown in the table below matches the 
descriptions of the Move To/From Special Purpose Register instructions 
on pages 384 and 387. The two 5-bit halves of the SPR number are 
reversed from the order in which they appear in an assembled instruction. 





          SPR
decimal   spr5:9 spr0:4    Register name   Privileged

1022      11111  11110     FPECR           yes
1023      11111  11111     PIR             yes



Floating-Point Exception Cause Register (FPECR) 

This register identifies the reason a Floating-Point Exception occurred. 

Processor ID Register (PIR) 

This register holds a value that distinguishes this processor from others in 
a multiprocessor. 
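
The remark above about the two 5-bit halves being reversed can be made concrete. In the sketch below, `spr_field` converts an SPR number into the 10-bit field as it sits in the instruction word (low-order half first), and `mfspr_word` assembles a Move From Special Purpose Register instruction using primary opcode 31 and extended opcode 339. The function names are hypothetical, chosen for this illustration.

```python
def spr_field(spr_number):
    """Swap the two 5-bit halves of a 10-bit SPR number.

    The result is the spr field as it appears in instruction bits 11:20.
    """
    if not 0 <= spr_number < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    return ((spr_number & 0x1F) << 5) | (spr_number >> 5)


def spr_number(field):
    """Inverse of spr_field; the swap is its own inverse."""
    return spr_field(field)


def mfspr_word(rt, spr):
    """Assemble 'mfspr RT,SPR' as a 32-bit instruction word."""
    return (31 << 26) | (rt << 21) | (spr_field(spr) << 11) | (339 << 1)
```

As a check, `mfspr_word(0, 8)` (that is, `mflr r0`) gives 0x7C0802A6, where the SPR number 8 shows up as 0x100 in the instruction's spr field.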



Interpretation of the 
DSISR as Set by an 
Alignment Interrupt 




For most causes of Alignment interrupt, the interrupt handler will emu- 
late the interrupting instruction. To do this, it needs the following char- 
acteristics of the interrupting instruction: 



Load or store 

Length (halfword, word, or doubleword) 

String, multiple, or elementary 

Fixed or float 

Update or non-update 

Byte reverse or not 

Is it dcbz?



The PowerPC Architecture provides this information implicitly, by set- 
ting bits in the DSISR that identify the interrupting instruction type. It is 
not necessary for the interrupt handler to load the interrupting instruc- 
tion from storage. The mapping is unique except for a few exceptions 
that are discussed below. The near-uniqueness depends on the fact that 
many instructions, such as the fixed- and floating-point arithmetic 
instructions and the one-byte loads and stores, cannot cause an Align- 
ment interrupt. 

See Section 5.5.6, "Alignment Interrupt," on page 464 for a descrip- 
tion of how the opcode and extended opcode are mapped to a DSISR 
value for an X-, D-, or DS-form instruction that causes an Alignment 
interrupt. 






The table that follows shows the inverse mapping: how the DSISR bits 
identify the interrupting instruction. The following notes are cited in the 
table. 

1. The instructions lwz and lwarx give the same DSISR bits (all zero).
But if lwarx causes an Alignment interrupt, it should not be emulated
in any precise way. It is adequate for the Alignment interrupt handler
simply to emulate the instruction as if it were an lwz. The emulator
must use the address in the DAR, rather than computing it from RA/
RB/D, because lwz and lwarx have different instruction formats.

If opcode 0 ("Illegal or Reserved") can cause an Alignment interrupt,
it will be indistinguishable to the interrupt handler from lwarx and
lwz.

2. These are distinguished by DSISR bits 12:13, which are not shown in 
the table. 

The Alignment interrupt handler will not be able to distinguish 
between a floating-point load or store interrupting because it is mis- 
aligned and one interrupting because it addresses direct-store. But this 
does not matter; in either case the access will be emulated using fixed- 
point instructions. 

The interrupt handler has no need to distinguish between an X-form
instruction and the corresponding D- or DS-form instruction if one exists,
and vice versa. Therefore, two such instructions may yield the same
DSISR value (all 32 bits). For example, stw and stwx may both yield
either the DSISR value shown in the following table for stw, or that
shown for stwx.



If DSISR     then it is either   or D/DS-form   so the instruction is:
15:21 is:    X-form opcode:      opcode:

00 0 0000    00000xxx00          x00000         lwarx, lwz, reserved¹
00 0 0001    00010xxx00          x00010         ldarx
00 0 0010    00100xxx00          x00100         stw
00 0 0011    00110xxx00          x00110         -
00 0 0100    01000xxx00          x01000         lhz
00 0 0101    01010xxx00          x01010         lha
00 0 0110    01100xxx00          x01100         sth
00 0 0111    01110xxx00          x01110         lmw
00 0 1000    10000xxx00          x10000         lfs
00 0 1001    10010xxx00          x10010         lfd
00 0 1010    10100xxx00          x10100         stfs
00 0 1011    10110xxx00          x10110         stfd
00 0 1100    11000xxx00          x11000         -
00 0 1101    11010xxx00          x11010         ld, ldu, lwa²
00 0 1110    11100xxx00          x11100         -
00 0 1111    11110xxx00          x11110         std, stdu²
00 1 0000    00001xxx00          x00001         lwzu
00 1 0001    00011xxx00          x00011         -
00 1 0010    00101xxx00          x00101         stwu
00 1 0011    00111xxx00          x00111         -
00 1 0100    01001xxx00          x01001         lhzu
00 1 0101    01011xxx00          x01011         lhau
00 1 0110    01101xxx00          x01101         sthu
00 1 0111    01111xxx00          x01111         stmw
00 1 1000    10001xxx00          x10001         lfsu
00 1 1001    10011xxx00          x10011         lfdu
00 1 1010    10101xxx00          x10101         stfsu
00 1 1011    10111xxx00          x10111         stfdu
00 1 1100    11001xxx00          x11001         -
00 1 1101    11011xxx00          x11011         -
00 1 1110    11101xxx00          x11101         -
00 1 1111    11111xxx00          x11111         -
01 0 0000    00000xxx01                         ldx
01 0 0001    00010xxx01                         -
01 0 0010    00100xxx01                         stdx
01 0 0011    00110xxx01                         -
01 0 0100    01000xxx01                         -
01 0 0101    01010xxx01                         lwax
01 0 0110    01100xxx01                         -
01 0 0111    01110xxx01                         -
01 0 1000    10000xxx01                         lswx
01 0 1001    10010xxx01                         lswi
01 0 1010    10100xxx01                         stswx
01 0 1011    10110xxx01                         stswi
01 0 1100    11000xxx01                         -
01 0 1101    11010xxx01                         -
01 0 1110    11100xxx01                         -
01 0 1111    11110xxx01                         -
01 1 0000    00001xxx01                         ldux
01 1 0001    00011xxx01                         -
01 1 0010    00101xxx01                         stdux
01 1 0011    00111xxx01                         -
01 1 0100    01001xxx01                         -
01 1 0101    01011xxx01                         lwaux
01 1 0110    01101xxx01                         -
01 1 0111    01111xxx01                         -
01 1 1000    10001xxx01                         -
01 1 1001    10011xxx01                         -
01 1 1010    10101xxx01                         -
01 1 1011    10111xxx01                         -
01 1 1100    11001xxx01                         -
01 1 1101    11011xxx01                         -
01 1 1110    11101xxx01                         -
01 1 1111    11111xxx01                         -
10 0 0000    00000xxx10                         -
10 0 0001    00010xxx10                         -
10 0 0010    00100xxx10                         stwcx.
10 0 0011    00110xxx10                         stdcx.
10 0 0100    01000xxx10                         -
10 0 0101    01010xxx10                         -
10 0 0110    01100xxx10                         -
10 0 0111    01110xxx10                         -
10 0 1000    10000xxx10                         lwbrx
10 0 1001    10010xxx10                         -
10 0 1010    10100xxx10                         stwbrx
10 0 1011    10110xxx10                         -
10 0 1100    11000xxx10                         lhbrx
10 0 1101    11010xxx10                         -
10 0 1110    11100xxx10                         sthbrx
10 0 1111    11110xxx10                         -
10 1 0000    00001xxx10                         -
10 1 0001    00011xxx10                         -
10 1 0010    00101xxx10                         -
10 1 0011    00111xxx10                         -
10 1 0100    01001xxx10                         eciwx
10 1 0101    01011xxx10                         -
10 1 0110    01101xxx10                         ecowx
10 1 0111    01111xxx10                         -
10 1 1000    10001xxx10                         -
10 1 1001    10011xxx10                         -
10 1 1010    10101xxx10                         -
10 1 1011    10111xxx10                         -
10 1 1100    11001xxx10                         -
10 1 1101    11011xxx10                         -
10 1 1110    11101xxx10                         -
10 1 1111    11111xxx10                         dcbz
11 0 0000    00000xxx11                         lwzx
11 0 0001    00010xxx11                         -
11 0 0010    00100xxx11                         stwx
11 0 0011    00110xxx11                         -
11 0 0100    01000xxx11                         lhzx
11 0 0101    01010xxx11                         lhax
11 0 0110    01100xxx11                         sthx
11 0 0111    01110xxx11                         -
11 0 1000    10000xxx11                         lfsx
11 0 1001    10010xxx11                         lfdx
11 0 1010    10100xxx11                         stfsx
11 0 1011    10110xxx11                         stfdx
11 0 1100    11000xxx11                         -
11 0 1101    11010xxx11                         -
11 0 1110    11100xxx11                         -
11 0 1111    11110xxx11                         stfiwx
11 1 0000    00001xxx11                         lwzux
11 1 0001    00011xxx11                         -
11 1 0010    00101xxx11                         stwux
11 1 0011    00111xxx11                         -
11 1 0100    01001xxx11                         lhzux
11 1 0101    01011xxx11                         lhaux
11 1 0110    01101xxx11                         sthux
11 1 0111    01111xxx11                         -
11 1 1000    10001xxx11                         lfsux
11 1 1001    10011xxx11                         lfdux
11 1 1010    10101xxx11                         stfsux
11 1 1011    10111xxx11                         stfdux
11 1 1100    11001xxx11                         -
11 1 1101    11011xxx11                         -
11 1 1110    11101xxx11                         -
11 1 1111    11111xxx11                         -
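
The column patterns in the table above suggest how DSISR bits 15:21 relate to the instruction word. The sketch below encodes that relationship as inferred from the table itself: for an X-form instruction, instruction bits 29:30, 25, and 21:24 supply DSISR bits 15:16, 17, and 18:21; for a D- or DS-form instruction, bits 15:16 are zero and opcode bits 5 and 1:4 supply bits 17 and 18:21. Section 5.5.6 remains the authoritative definition; the function name is hypothetical.

```python
def dsisr_15_21(instruction):
    """Return the 7-bit DSISR 15:21 value for a 32-bit instruction word.

    Inferred from the table's X-form and D/DS-form opcode columns.
    """
    opcode = (instruction >> 26) & 0x3F
    if opcode == 31:                       # X-form: use the extended opcode
        xo = (instruction >> 1) & 0x3FF    # instruction bits 21:30
        top2 = xo & 0b11                   # bits 29:30 -> DSISR 15:16
        mid = (xo >> 5) & 1                # bit 25     -> DSISR 17
        low4 = (xo >> 6) & 0xF             # bits 21:24 -> DSISR 18:21
    else:                                  # D- or DS-form: primary opcode
        top2 = 0                           # DSISR 15:16 are zero
        mid = opcode & 1                   # bit 5      -> DSISR 17
        low4 = (opcode >> 1) & 0xF         # bits 1:4   -> DSISR 18:21
    return (top2 << 5) | (mid << 4) | low4
```

An interrupt handler would do the reverse lookup, using the value read from the DSISR to select a row of the table; instructions that share a row (for example ld, ldu, and lwa) still need DSISR bits 12:13 to be told apart.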
PowerPC Operating
Environment
Instruction Set

        Opcode               Mode
Form    Primary   Extended   Dep.¹   Priv.²   Page   Mnemonic   Instruction

X       31        470                P        439    dcbi       Data Cache Block Invalidate
X       31        310                         491    eciwx      External Control In Word Indexed
X       31        438                         492    ecowx      External Control Out Word Indexed
X       31        83                 P        389    mfmsr      Move From Machine State Register
XFX     31        339                O        387    mfspr      Move From Special Purpose Register
X       31        595        {}      P        441    mfsr       Move From Segment Register
X       31        659        {}      P        442    mfsrin     Move From Segment Register Indirect
X       31        146                P        389    mtmsr      Move To Machine State Register
XFX     31        467                O        384    mtspr      Move To Special Purpose Register
X       31        210        {}      P        440    mtsr       Move To Segment Register
X       31        242        {}      P        441    mtsrin     Move To Segment Register Indirect
XL      19        50                 P        379    rfi        Return From Interrupt
SC      17                                    378    sc         System Call
X       31        498        ()      P        444    slbia      SLB Invalidate All
X       31        434        ()      P        443    slbie      SLB Invalidate Entry
X       31        370                P        445    tlbia      TLB Invalidate All
X       31        306                P        444    tlbie      TLB Invalidate Entry
X       31        566                P        445    tlbsync    TLB Synchronize

¹Key to Mode Dependency Column

Parentheses () are shown if the instruction is defined only for 64-bit implementations.
Braces {} are shown if the instruction is defined only for 32-bit implementations.
All instructions in the PowerPC Operating Environment Architecture are
mode-independent, except that if the instruction refers to storage when in 32-bit mode, only the
low-order 32 bits of the 64-bit effective address are used to address storage.

²Key to Privilege Column

P denotes a privileged instruction.

O denotes an instruction that may be treated as privileged or non-privileged,
depending on the SPR number.
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Index 



A 

AA19 

address 27, 398, 399, 401, 402, 404, 406, 408, 412, 
413, 415, 417, 423, 433 

effective 29 

real 399 
A-form 18 
aliasing 333 
alignment 464, 503 

effect 339 
ASR 402 

assembler 215, 495 
atomic 336 
atomicity 

single-copy 322 

B 

BA 19 

BAT 399, 423 
BB 19 
BD 19 
BE 376 
BF 19 
BFA19 
B-form 14 
BI 20 

Big-Endian 233 
block 320, 399, 423 
BO 20 

branch 470 
BT 20 
byte 233 
bytes 5 



c 

C139 
CA48 

cache 327, 343, 344 
caching 392, 430 
change 433, 440, 446 
CIA 10 
coherence 431 
combined 332 
combining 

accesses 431 

stores 430 
context 371 

definition 368 
CR 32 
CTR 35 

D 

D20 

DAR 381, 466 
data 346, 460, 462, 466 
dcbf 349, 460 
dcbi 439, 460 
dcbst 348, 460 
debt 346, 460 
dcbtst 347, 460 
dcbz 347, 460, 465 
DEC 481 
decrementer 470 
defined 23 
delayed 460 
denormalization 147 
denormalized 144 
D-form 15 






direct-store 421 
double precision 147 
doublewords 5 
DR 377 
DS 20 
DS-form 15 
DSISR 382 

alignment 503 
dual 328 

E 

EA 29 
EAR 490 
eciwx 460, 491 
ecowx 460, 492 
EE 375 

effective 29, 391, 399, 402, 413 

eieio 350 

EQ 33, 34 

exception 368 

execution 372 

external 460, 464 

F 

FE 34, 139 
FE0 376
FE1 376
FEX 137 
FG 34, 139 
FI 138
FL 34, 139 
FLM 20 

floating point 469 

denormalization 147 
double precision 147 
exceptions 135, 150 

inexact 162 

invalid 155 

overflow 159 

underflow 160 

zero 158 
execution 162 
normalization 147 
number 

denormalized 144 

infinity 144 

normalized 143 

not 145 



zero 144 
rounding 149 
sign 146 

single precision 147 
floating-point 471 
FP 376
FPCC 139 
FPR 135 
FPRF 139 
FPSCR 137 

C 139

FE 139

FEX 137

FG 139

FI 138 

FL 139 

FPCC 139 

FPRF 139 

FR 138

FU 139 

FX 137 

NI 140 

OE 140 

OX 138 

RN 141 

UE 140 

UX 138

VE 140

VX 137

VXCVI 139 

VXIDI 138 

VXIMZ 138 

VXISI 138 

VXSNAN 138 

VXSOFT 139 

VXSQRT 139 

VXVC 138 

VXZDZ 138 

XE 140 

XX 138 

ZE 140 

ZX 138
FR 138
FRA 20
FRB 20
FRC 20
FRS 20
FRT 20
FU 35, 139
FX 137
FXM 20



G

GPR 47 
GT 33, 34 
Guarded 397, 431 
Gulliver's Travels 233 

H 

halfwords 5 
hardware 8, 368 
hashed 406, 410, 417, 419 
HTAB 406, 417 

search 410, 419 

I 

icbi 345, 460 
I-form 14 
ILE 375 
illegal 24 
implicit 395 
inexact 162 
infinity 144 
inhibited 430 

instruction 344, 396, 463, 465 
fields 19, 20, 21, 22, 23, 370 
AA 19
BA 19
BB 19
BD 19
BF 19
BFA 19
BI 20
BO 20
BT 20
D 20
DS 20
FLM 20
FRA 20
FRB 20
FRC 20
FRS 20
FRT 20
FXM 20
L 21
LI 21
LK 21
MB 21 
ME 21 



NB 21 
OE 22 
RA 22 
RB 22
Rc 22
RS 22
RT 22
SH 22 
SI 22 

SPR 22, 371 
SR 22, 371 
TBR 22 
TO 22 
U 23
UI 23 
XO 23 

formats 12, 14, 15, 16, 17, 18, 370 

A-form 18 

B-form 14 

D-form 15 

DS-form 15 

I-form 14 

M-form 18 

MD-form 18 

MDS-form 19 

SC-form 14 

X-form 16 

XFL-form 17 

XFX-form 17 

XL-form 17 

XO-form 18 

XS-form 17 
instruction-caused 454 
instructions 
classes 23 
dcbf 349, 460 
dcbi 439, 460 
dcbst 348, 460 
dcbt 346, 460
dcbtst 347, 460 
dcbz 347, 460, 465 
defined 23 

forms 25 
eciwx 460, 491 
ecowx 460, 492 
eieio 350 
icbi 345, 460 
illegal 24 
invalid 25
isync 346

ldarx 336, 460, 465
lwarx 336, 460, 465
optional 26, 489 
preferred 25 
reserved 24 
rfi 458 
slbia 443 
slbie 443 

stdcx. 336, 460, 465 

storage 343, 438 

stwcx. 336, 460, 465 

sync 334 

tlbia 443

tlbie 443 

tlbsync 443 
interrupt 368, 453, 475 

recoverable 457 
interrupts 

Alignment 464 

Data 460 

Decrementer 470 

External 464 

floating point 469, 471 

Instruction 463 

instruction-caused 454 

Machine 459 

new 457 

precise 454 

Program 467 

System 457, 470 

system-caused 454 

Trace 470 
invalid 25, 155 
IP 377 
IR 377
isync 346 

K 

K 436, 437 
key 436, 437

L 

L 21

language 8
ldarx 460, 465
LE 377
LI 21

Little-Endian 233
LK 21



load 320 
lookaside 483 
LR 35 
LT 33, 34 
lwarx 460, 465

M 

machine 458, 459 
Machine State Register 

Branch 376 

Data 377 

External 375 

FP 376 

Instruction 377 
Interrupt 375, 377 
Little-Endian 377 
Machine 376 
Power 375 
Problem 375 
Recoverable 377 
Single-Step 376 
Sixty-Four-bit 375 

main 319 

MB 21 

MD-form 18 

MDS-form 19 

ME 21, 376 

memory 323, 392, 431

M-form 18 

mismatched 433 

mnemonics 

extended 215, 495 

MSR 374 

N 

NB 21

next 378, 379
NI 140
NIA 10
no-op 107 
normalization 147 
normalized 143 
not 145 

O

OE 22, 140 
optional 26 



OV 48

overflow 159 
OX 138 

P 

page 320, 393, 406, 408, 410, 417, 419, 433, 437, 446 

POW 375 

PP 436, 437 

PR 375 

precise 454 

preferred 25 

prefetch 

instruction 396 
program 319, 467 
protection 465 
PTE 408, 417 
PVR 383 

Q 

quadwords 5 

R 

RA 22 

RB 22 

RC 433 

Rc 22 

real 399 

recoverable 457 

reference 433, 440, 446 

register 8 

registers 

Address 402 

Condition Register 32 

Count Register 35 

DAR 

Data 462, 466 
Data Address Register 381, 466 
Data Storage Interrupt Status Register 382 
Decrementer 481 
EAR 

External 460 
External 490 

Fixed-Point Exception Register 48 
Floating-Point Registers 135 
Floating-Point Status and Control Register 137 
General Purpose Registers 47 
implementation-specific 501 



Link Register 35 
Machine 373, 374 
Machine State Register 374 
MSR 

Machine 458 

optional 489 

Processor 383 

SDR1 409, 418

Segment 483 

SPRGn 382 

SPRs 381, 483, 501

SRR0 373

Machine 458

SRR1 374

Machine 458 

status 483 

Time Base 351, 479 
reserved 8, 24, 369 
rfi 458 
RI 377
RN 141
rounding 149
RS 22
RT 22
RTL 8, 369 

S

SC-form 14
SDR1 409, 418
SE 376

segment 404, 406, 446, 483 
direct-store 399, 421 
ordinary 399 

SF 375

SH 22 

SI 22 

sign 146 

single precision 147 
single-copy 322 
single-step 470 
SLB 406 
slbia 443 
slbie 443 
SO 33, 35, 48 
speculative 396 
split 14, 328 
SPR 22, 371 
SPRGn 382 
SPRs 381, 483 
SR 22, 371
SRR0 373
SRR1 374
STAB 404 

search 404 
status 483 
stdcx. 460, 465
STE 404 

storage 27, 167, 319, 320, 343, 392, 396, 429, 431, 
436, 437, 438 
access 333, 342 
atomic 336 
coherence 323 
consistency 392 
Guarded 397 
instruction 341 
order 333 

ordering 333, 334, 350, 392 

reservation 337 

segments 393 

shared 333 

weak 392 
store 320 
stwcx. 460, 465
Swift, Jonathan 233 
symbols 215, 495 
sync 334, 446 

synchronization 371, 446, 483 

context 371 

execution 372 

interrupts 453 
system 457, 470 
system-caused 454 

T 

table 446 
TB 351, 479
TBL 351, 479 
TBR 22 
TBU 351, 479 
32-bit 398 
Time Base 351, 479
TLB 412, 421
tlbia 443
tlbie 443 
tlbsync 443 
TO 22 
trace 470 

translation 412, 421 
trap 368 



U

U 23

UE 140

UI 23

underflow 160
UX 138

V 

VE 140

virtual 321, 399, 402, 406, 413, 415

VX 137

VXCVI 139 

VXIDI 138 

VXIMZ 138 

VXISI 138 

VXSNAN 138 

VXSOFT 139 

VXSQRT 139 

VXVC 138 

VXZDZ 138 

W

WIMG 399, 421, 430 
words 5 

write 332, 392, 430 

X 

X-form 16 
XE 140 
XER 48 
XFL-form 17 

XFX-form 17 
XL-form 17 
XO 23 
XO-form 18 
XS-form 17 
XX 138 

Z 

ZE 140
zero 144, 158
ZX 138
This is the official technical description of the PowerPC architecture and its
hardware conventions, developed jointly by IBM, Motorola, and Apple. An essential 
reference for hardware and system software designers and applications programmers 
developing a range of products using implementations of the PowerPC family of 
microprocessors — from palmtops to teraFLOPS. 

The PowerPC Architecture provides a stable base for software, allowing 
applications that run on one PowerPC processor to run consistently on any other 
PowerPC processor. In addition, well-designed operating systems can be moved 
from one processor implementation to another by making only a few minor changes. 

To achieve this, the specification of the architecture has been structured into
three Books, each corresponding to a distinct level of the architecture:

Book I, User Instruction Set Architecture, describes the registers, instructions, 
storage model, and execution model that are available to all application 
programs. 

Book II, Virtual Environment Architecture, describes features of the architecture 
that permit application programs to create or modify code, to share data 
among programs in a multiprocessing system, and to optimize the performance 
of storage accesses. 

Book III, Operating Environment Architecture, describes features of the 
architecture that permit operating systems to allocate and manage storage, 
to handle errors encountered by application programs, to support I/O devices, 
and to provide the other services expected of secure, modern, multiprocessor 
operating systems. 

An important feature of these specifications is that they only constrain 
implementations on matters that affect software compatibility. Even more significant, 
they specify the architecture in a manner that is independent of implementation. 

The PowerPC Architecture is a must for anyone who needs to understand the
levels of compatibility between different processors in the PowerPC family: the 601
microprocessor, the 603 (for low-end, battery-powered requirements), the 604 (optimized
price/performance for scalable symmetric multiprocessors), and the 620 (for
high-end technical and commercial requirements for absolute performance).



OF RELATED INTEREST: 

POWER and PowerPC: Principles, Architecture, Implementation

Shlomo Weiss and James E. Smith 



Morgan Kaufmann Publishers, Inc. 
San Francisco, California 







